PREFACE TO THIRD EDITION

The book has again been mostly rewritten to bring in various
improvements. The chief of these is the use of the notation of bra
and ket vectors, which I have developed since 1939. This notation
allows a more direct connexion to be made between the formalism
in terms of the abstract quantities corresponding to states and
observables and the formalism in terms of representatives; in fact
the two formalisms become welded into a single comprehensive
scheme. With the help of this notation several of the deductions in
the book take a simpler and neater form.

Other substantial alterations include:

(i) A new presentation of the theory of systems with similar 
particles, based on Fock’s treatment of the theory of radiation
adapted to the present notation. This treatment is simpler and more 
powerful than the one given in earlier editions of the book. 

(ii) A further development of quantum electrodynamics, including
the theory of the Wentzel field. The theory of the electron in
interaction with the electromagnetic field is carried as far as it can be at
the present time without getting on to speculative ground.

P. A. M. D. 

ST. JOHN’S COLLEGE, CAMBRIDGE
21 April 1947

FROM THE

PREFACE TO THE SECOND EDITION 

The book has been mostly rewritten. I have tried by carefully over- 
hauling the method of presentation to give the development of the 
theory in a rather less abstract form, without making any sacrifices 
in exactness of expression or in the logical character of the develop- 
ment. This should make the work suitable for a wider circle of 
readers, although the reader who likes abstractness for its own sake 
may possibly prefer the style of the first edition.

The main change has been brought about by the use of the word 
‘state’ in a three-dimensional non-relativistic sense. It would seem
at first sight a pity to build up the theory largely on the basis of
non-relativistic concepts. The use of the non-relativistic meaning of
‘state’, however, contributes so essentially to the possibilities of
clear exposition as to lead one to suspect that the fundamental ideas
of the present quantum mechanics are in need of serious alteration at
just this point, and that an improved theory would agree more closely
with the development here given than with a development which
aims at preserving the relativistic meaning of ‘state’ throughout. 

P. A. M. D. 

THE INSTITUTE FOR ADVANCED STUDY
PRINCETON 

27 November 1934 



FROM THE 

PREFACE TO THE FIRST EDITION 

The methods of progress in theoretical physics have undergone a
vast change during the present century. The classical tradition 
has been to consider the world to be an association of observable 
objects (particles, fluids, fields, etc.) moving about according to 
definite laws of force, so that one could form a mental picture in
space and time of the whole scheme. This led to a physics whose aim
was to make assumptions about the mechanism and forces connecting
these observable objects, to account for their behaviour in the
simplest possible way. It has become increasingly evident in recent 
times, however, that nature works on a different plan. Her funda- 
mental laws do not govern the world as it appears in our mental 
picture in any very direct way, but instead they control a substratum
of which we cannot form a mental picture without introducing
irrelevancies. The formulation of these laws requires the use
of the mathematics of transformations. The important things in 
the world appear as the invariants (or more generally the nearly 
invariants, or quantities with simple transformation properties) 
of these transformations. The things we are immediately aware of 
are the relations of these nearly invariants to a certain frame of 
reference, usually one chosen so as to introduce special simplifying 
features which are unimportant from the point of view of general 
theory. 

The growth of the use of transformation theory, as applied first to 
relativity and later to the quantum theory, is the essence of the new 
method in theoretical physics. Further progress lies in the direction 
of making our equations invariant under wider and still wider trans- 
formations. This state of affairs is very satisfactory from a philo- 
sophical point of view, as implying an increasing recognition of the 
part played by the observer in himself introducing the regularities 
that appear in his observations, and a lack of arbitrariness in the ways 
of nature, but it makes things less easy for the learner of physics.
The new theories, if one looks apart from their mathematical setting, 
are built up from physical concepts which cannot be explained in 
terms of things previously known to the student, which cannot even 
be explained adequately in words at all. Like the fundamental concepts
(e.g. proximity, identity) which every one must learn on his
arrival into the world, the newer concepts of physics can be mastered
only by long familiarity with their properties and uses.

From the mathematical side the approach to the new theories 
presents no difficulties, as the mathematics required (at any rate that
which is required for the development of physics up to the present)
is not essentially different from what has been current for a considerable
time. Mathematics is the tool specially suited for dealing with
abstract concepts of any kind and there is no limit to its power in this
field. For this reason a book on the new physics, if not purely descriptive
of experimental work, must be essentially mathematical. All the
same the mathematics is only a tool and one should learn to hold the
physical ideas in one’s mind without reference to the mathematical
form. In this book I have tried to keep the physics to the forefront,
by beginning with an entirely physical chapter and in the later work
examining the physical meaning underlying the formalism wherever
possible. The amount of theoretical ground one has to cover before 
being able to solve problems of real practical value is rather large, but 
this circumstance is an inevitable consequence of the fundamental 
part played by transformation theory and is likely to become more 
pronounced in the theoretical physics of the future.

With regard to the mathematical form in which the theory can be 
presented, an author must decide at the outset between two methods. 
There is the symbolic method, which deals directly in an abstract way 
with the quantities of fundamental importance (the invariants, etc., 
of the transformations) and there is the method of coordinates or 
representations, which deals with sets of numbers corresponding to 
these quantities. The second of these has usually been used for the 
presentation of quantum mechanics (in fact it has been used practically
exclusively with the exception of Weyl’s book Gruppentheorie
und Quantenmechanik). It is known under one or other of the two
names ‘Wave Mechanics’ and ‘Matrix Mechanics’ according to which
physical things receive emphasis in the treatment, the states of a
system or its dynamical variables. It has the advantage that the kind
of mathematics required is more familiar to the average student, and
also it is the historical method.

The symbolic method, however, seems to go more deeply into the
nature of things. It enables one to express the physical laws in a neat
and concise way, and will probably be increasingly used in the future
as it becomes better understood and its own special mathematics gets
developed. For this reason I have chosen the symbolic method,
introducing the representatives later merely as an aid to practical
calculation. This has necessitated a complete break from the historical
line of development, but this break is an advantage through
enabling the approach to the new ideas to be made as direct as
possible.

P. A. M. D. 

ST. JOHN’S COLLEGE, CAMBRIDGE 

29 May 1930 
I

THE PRINCIPLE OF SUPERPOSITION 
1. The need for a quantum theory 

Classical mechanics has been developed continuously from the time 
of Newton and applied to an ever-widening range of dynamical 
systems, including the electromagnetic field in interaction with 
matter. The underlying ideas and the laws governing their application
form a simple and elegant scheme, which one would be inclined
to think could not be seriously modified without having all its
attractive features spoilt. Nevertheless it has been found possible to
set up a new scheme, called quantum mechanics, which is more
suitable for the description of phenomena on the atomic scale and
which is in some respects more elegant and satisfying than the
classical scheme. This possibility is due to the changes which the
new scheme involves being of a very profound character and not
clashing with the features of the classical theory that make it so
attractive, as a result of which all these features can be incorporated
in the new scheme.

The necessity for a departure from classical mechanics is clearly
shown by experimental results. In the first place the forces known
in classical electrodynamics are inadequate for the explanation of the
remarkable stability of atoms and molecules, which is necessary in
order that materials may have any definite physical and chemical
properties at all. The introduction of new hypothetical forces will not
save the situation, since there exist general principles of classical
mechanics, holding for all kinds of forces, leading to results in direct
disagreement with observation. For example, if an atomic system has
its equilibrium disturbed in any way and is then left alone, it will be set
in oscillation and the oscillations will get impressed on the surrounding
electromagnetic field, so that their frequencies may be observed
with a spectroscope. Now whatever the laws of force governing the
equilibrium, one would expect to be able to include the various
frequencies in a scheme comprising certain fundamental frequencies and
their harmonics. This is not observed to be the case. Instead, there
is observed a new and unexpected connexion between the frequencies,
called Ritz’s Combination Law of Spectroscopy, according to which all
the frequencies can be expressed as differences between certain terms,
the number of terms being much less than the number of frequencies.
This law is quite unintelligible from the classical standpoint. 
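
As a rough illustration of the Combination Law, the following sketch builds every frequency as a difference between two terms and counts both sets. The term values are the standard hydrogen terms R/n², brought in here purely as an assumption for illustration (the passage above names no particular atom), and the function names `terms` and `combination_frequencies` are hypothetical.

```python
from itertools import combinations

# Approximate Rydberg constant for hydrogen in cm^-1 (assumed illustrative value).
RYDBERG_H = 109677.58

def terms(n_max):
    """Spectroscopic terms T_n = R / n^2 for n = 1..n_max (hydrogen, for illustration)."""
    return [RYDBERG_H / n**2 for n in range(1, n_max + 1)]

def combination_frequencies(term_values):
    """Every wavenumber expressible as a difference between two distinct terms."""
    return [abs(a - b) for a, b in combinations(term_values, 2)]

t = terms(6)
freqs = combination_frequencies(t)
print(len(t), "terms give", len(freqs), "frequencies")

# The combination property: two differences sharing a term add up to a third difference.
print((t[0] - t[1]) + (t[1] - t[2]), "equals", t[0] - t[2])
```

With n terms there are at most n(n-1)/2 differences, so the number of terms grows much more slowly than the number of frequencies. In real spectra not every difference appears as an observed line.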

One might try to get over the difficulty without departing from 
classical mechanics by assuming each of the spectroscopically observed
frequencies to be a fundamental frequency with its own degree
of freedom, the laws of force being such that the harmonic vibrations
do not occur. Such a theory will not do, however, even apart from 
the fact that it would give no explanation of the Combination Law, 
since it would immediately bring one into conflict with the experi- 
mental evidence on specific heats. Classical statistical mechanics 
enables one to establish a general connexion between the total number 
of degrees of freedom of an assembly of vibrating systems and its 
specific heat. If one assumes all the spectroscopic frequencies of an 
atom to correspond to different degrees of freedom, one would get a 
specific heat for any kind of matter very much greater than the 
observed value. In fact the observed specific heats at ordinary 
temperatures are given fairly well by a theory that takes into account 
merely the motion of each atom as a whole and assigns no internal 
motion to it at all. 

This leads us to a new clash between classical mechanics and the 
results of experiment. There must certainly be some internal motion 
in an atom to account for its spectrum, but the internal degrees of 
freedom, for some classically inexplicable reason, do not contribute 
to the specific heat. A similar clash is found in connexion with the 
energy of oscillation of the electromagnetic field in a vacuum. Classical
mechanics requires the specific heat corresponding to this energy to 
be infinite, but it is observed to be quite finite. A general conclusion 
from experimental results is that oscillations of high frequency do 
not contribute their classical quota to the specific heat. 

As another illustration of the failure of classical mechanics we may 
consider the behaviour of light. We have, on the one hand, the 
phenomena of interference and diffraction, which can be explained
only on the basis of a wave theory; on the other, phenomena such as
photo-electric emission and scattering by free electrons, which show
that light is composed of small particles. These particles, which
are called photons, have each a definite energy and momentum,
depending on the frequency of the light, and appear to have just as
real an existence as electrons, or any other particles known in physics.
A fraction of a photon is never observed.

Experiments have shown that this anomalous behaviour is not
peculiar to light, but is quite general. All material particles have
wave properties, which can be exhibited under suitable conditions.
We have here a very striking and general example of the breakdown
of classical mechanics: not merely an inaccuracy in its laws of motion,
but an inadequacy of its concepts to supply us with a description of
atomic events.

The necessity to depart from classical ideas when one wishes to 
account for the ultimate structure of matter may be seen, not only 
from experimentally established facts, but also from general philo- 
sophical grounds. In a classical explanation of the constitution of 
matter, one would assume it to be made up of a large number of small 
constituent parts and one would postulate laws for the behaviour of 
these parts, from which the laws of the matter in bulk could be deduced.
This would not complete the explanation, however, since the
question of the structure and stability of the constituent parts is left
untouched. To go into this question, it becomes necessary to postulate
that each constituent part is itself made up of smaller parts, in
terms of which its behaviour is to be explained. There is clearly no
end to this procedure, so that one can never arrive at the ultimate
structure of matter on these lines. So long as big and small are merely
relative concepts, it is no help to explain the big in terms of the small.
It is therefore necessary to modify classical ideas in such a way as to 
give an absolute meaning to size. 

At this stage it becomes important to remember that science is
concerned only with observable things and that we can observe an 
object only by letting it interact with some outside influence. An act 
of observation is thus necessarily accompanied by some disturbance 
of the object observed. We may define an object to be big when the 
disturbance accompanying our observation of it may be neglected, 
and small when the disturbance cannot be neglected. This definition 
is in close agreement with the common meanings of big and small.

It is usually assumed that, by being careful, we may cut down the
disturbance accompanying our observation to any desired extent.
The concepts of big and small are then purely relative and refer to the
gentleness of our means of observation as well as to the object being
described. In order to give an absolute meaning to size, such as is
required for any theory of the ultimate structure of matter, we have
to assume that there is a limit to the fineness of our powers of observation
and the smallness of the accompanying disturbance, a limit which is
inherent in the nature of things and can never be surpassed by improved
technique or increased skill on the part of the observer. If the object under
observation is such that the unavoidable limiting disturbance is negligible,
then the object is big in the absolute sense and we may apply
classical mechanics to it. If, on the other hand, the limiting dis- 
turbance is not negligible, then the object is small in the absolute 
sense and we require a new theory for dealing with it. 

A consequence of the preceding discussion is that we must revise
our ideas of causality. Causality applies only to a system which is
left undisturbed. If a system is small, we cannot observe it without 
producing a serious disturbance and hence we cannot expect to find 
any causal connexion between the results of our observations. 
Causality will still be assumed to apply to undisturbed systems and 
the equations which will be set up to describe an undisturbed system
will be differential equations expressing a causal connexion between
conditions at one time and conditions at a later time. These equations
will be in close correspondence with the equations of classical
mechanics, but they will be connected only indirectly with the results 
of observations. There is an unavoidable indeterminacy in the calcu- 
lation of observational results, the theory enabling us to calculate in 
general only the probability of our obtaining a particular result when 
we make an observation. 


2. The polarization of photons 
The discussion in the preceding section about the limit to the
gentleness with which observations can be made and the consequent
indeterminacy in the results of those observations does not provide
any quantitative basis for the building up of quantum mechanics.
For this purpose a new set of accurate laws of nature is required.
One of the most fundamental and most drastic of these is the Principle
of Superposition of States. We shall lead up to a general formulation
of this principle through a consideration of some special cases, taking
[...] photo-electrons, there is a preferential direction for the
[...]. Thus the polarization properties of light are closely
connected with its corpuscular properties and one must ascribe a
polarization to the photons. [...]
of light plane-polarized in a certain direction as consisting of photons
each of which is plane-polarized in that direction and a beam of
circularly polarized light as consisting of photons each circularly
polarized. Every photon is in a certain state of polarization, as we
shall say. The problem we must now consider is how to fit in these
ideas with the known facts about the resolution of light into polarized
components and the recombination of these components. 

Let us take a definite case. Suppose we have a beam of light passing 
through a crystal of tourmaline, which has the property of letting 
through only light plane-polarized perpendicular to its optic axis.
Classical electrodynamics tells us what will happen for any given
polarization of the incident beam. If this beam is polarized
perpendicular to the optic axis, it will all go through the crystal; if
parallel to the axis, none of it will go through; while if polarized at
an angle $\alpha$ to the axis, a fraction $\sin^2 \alpha$ will go through. How are we
to understand these results on a photon basis?

A beam that is plane-polarized in a certain direction is to be
pictured as made up of photons each plane-polarized in that 
direction. This picture leads to no difficulty in the cases when our 
incident beam is polarized perpendicular or parallel to the optic axis. 
We merely have to suppose that each photon polarized perpendicular 
to the axis passes unhindered and unchanged through the crystal, 
while each photon polarized parallel to the axis is stopped and absorbed.
A difficulty arises, however, in the case of the obliquely
polarized incident beam. Each of the incident photons is then
obliquely polarized and it is not clear what will happen to such a 
photon when it reaches the tourmaline. 

A question about what will happen to a particular photon under 
certain conditions is not really very precise. To make it precise one
must imagine some experiment performed having a bearing on the
question and inquire what will be the result of the experiment. Only
questions about the results of experiments have a real significance
and it is only such questions that theoretical physics has to consider.

In our present example the obvious experiment is to use an incident 
beam consisting of only a single photon and to observe what appears
on the back side of the crystal. According to quantum mechanics
the result of this experiment will be that sometimes one will find a
whole photon, of energy equal to the energy of the incident photon,
on the back side and other times one will find nothing. When one
finds a whole photon, it will be polarized perpendicular to the optic
axis. One will never find only a part of a photon on the back side.
If one repeats the experiment a large number of times, one will find
the photon on the back side in a fraction $\sin^2 \alpha$ of the total number
of times. Thus we may say that the photon has a probability $\sin^2 \alpha$
of passing through the tourmaline and appearing on the back side
polarized perpendicular to the axis and a probability $\cos^2 \alpha$ of being
absorbed. These values for the probabilities lead to the correct
classical results for an incident beam containing a large number of
photons.
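
The statistical statement above can be tried out numerically. The sketch below is only an illustration under stated assumptions: each photon is modelled as an independent random trial that passes with probability sin²α, and the function name `transmitted_fraction`, the seed, and the photon count are hypothetical choices, not anything prescribed by the text.

```python
import math
import random

def transmitted_fraction(alpha, n_photons, seed=0):
    """Send n_photons single photons, each polarized at angle alpha (radians)
    to the optic axis, at the crystal; return the fraction found behind it.

    Each photon is either found whole or not at all; there are no fractions.
    """
    rng = random.Random(seed)          # seeded so the run is repeatable
    p_pass = math.sin(alpha) ** 2      # probability of passing through
    passed = sum(1 for _ in range(n_photons) if rng.random() < p_pass)
    return passed / n_photons

alpha = math.radians(30)
frac = transmitted_fraction(alpha, 100_000)
print("simulated fraction:", frac)
print("classical fraction sin^2(alpha):", math.sin(alpha) ** 2)
```

For a large number of photons the simulated fraction settles near the classical value, while the fate of any single photon remains unpredictable.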

In this way we preserve the individuality of the photon in all 
cases. We are able to do this, however, only because we abandon the
determinacy of the classical theory. The result of an experiment is 
not determined, as it would be according to classical ideas, by the 
conditions under the control of the experimenter. The most that can 
be predicted is a set of possible results, with a probability of occur- 
rence for each. 

The foregoing discussion about the result of an experiment with a 
single obliquely polarized photon incident on a crystal of tourmaline 
answers all that can legitimately be asked about what happens to an 
obliquely polarized photon when it reaches the tourmaline. Questions
about what decides whether the photon is to go through or not and 
how it changes its direction of polarization when it does go through 
cannot be investigated by experiment and should be regarded as 
outside the domain of science. Nevertheless some further description 
is necessary in order to correlate the results of this experiment with 
the results of other experiments that might be performed with 
photons and to fit them all into a general scheme. Such further 
description should be regarded, not as an attempt to answer questions 
outside the domain of science, but as an aid to the formulation of 
rules for expressing concisely the results of large numbers of experiments.

The further description provided by quantum mechanics runs as
follows. It is supposed that a photon polarized obliquely to the optic
axis may be regarded as being partly in the state of polarization
parallel to the axis and partly in the state of polarization perpendicular
to the axis. The state of oblique polarization may be considered
as the result of some kind of superposition process applied to
the two states of parallel and perpendicular polarization. This implies
a certain special kind of relationship between the various states of
polarization, a relationship similar to that between polarized beams in
classical optics, but which is now to be applied, not to beams, but to
the states of polarization of one particular photon. This relationship 
allows any state of polarization to be resolved into, or expressed as a 
superposition of, any two mutually perpendicular states of polari- 
zation. 

When we make the photon meet a tourmaline crystal, we are sub- 
jecting it to an observation. We are observing whether it is polarized 
parallel or perpendicular to the optic axis. The effect of making this
observation is to force the photon entirely into the state of parallel 
or entirely into the state of perpendicular polarization. It has to
make a sudden jump from being partly in each of these two states to 
being entirely in one or other of them. Which of the two states it will 
jump into cannot be predicted, but is governed only by probability 
laws. If it jumps into the parallel state it gets absorbed and if it 
jumps into the perpendicular state it passes through the crystal and 
appears on the other side preserving this state of polarization. 
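
The resolution of an oblique state into parallel and perpendicular states can be written compactly in the ket notation mentioned in the preface. What follows is a sketch under assumptions, not the formal development: the symbols $|{\parallel}\rangle$, $|{\perp}\rangle$ and $|\alpha\rangle$ are labels chosen here for the three polarization states, and the rule that probabilities are squared moduli of the coefficients is assumed rather than derived in this section.

```latex
\begin{align*}
|\alpha\rangle &= \cos\alpha\,|{\parallel}\rangle + \sin\alpha\,|{\perp}\rangle,\\
P(\text{jump to parallel, absorbed}) &= |\langle{\parallel}|\alpha\rangle|^{2} = \cos^{2}\alpha,\\
P(\text{jump to perpendicular, transmitted}) &= |\langle{\perp}|\alpha\rangle|^{2} = \sin^{2}\alpha,\\
\cos^{2}\alpha + \sin^{2}\alpha &= 1 .
\end{align*}
```

The coefficients are the same ones that resolve a classical polarized beam into components, but here they refer to the state of one photon, and the two probabilities agree with the values $\cos^2 \alpha$ and $\sin^2 \alpha$ found for the tourmaline experiment.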

3. Interference of photons 

In this section we shall deal with another example of superposition. 
We shall again take photons, but shall be concerned with their posi- 
tion in space and their momentum instead of their polarization. If 
we are given a beam of roughly monochromatic light, then we know 
something about the location and momentum of the associated 
photons. We know that each of them is located somewhere in the 
region of space through which the beam is passing and has a momen- 
tum in the direction of the beam of magnitude given in terms of the 
frequency of the beam by Einstein’s photo-electric law — momentum 
equals frequency multiplied by a universal constant. When we have 
such information about the location and momentum of a photon we 
shall say that it is in a definite translational state.

We shall discuss the description which quantum mechanics provides
of the interference of photons. Let us take a definite experiment
demonstrating interference. Suppose we have a beam of light
which is passed through some kind of interferometer, so that it gets
split up into two components and the two components are subse- 
quently made to interfere. We may, as in the preceding section, take 
an incident beam consisting of only a single photon and inquire what 
will happen to it as it goes through the apparatus. This will present
to us the difficulty of the conflict between the wave and corpuscular 
theories of light in an acute form. 

Corresponding to the description that we had in the case of the 
polarization, we must now describe the photon as going partly into
each of the two components into which the incident beam is split.
The photon is then, as we may say, in a translational state given by the
superposition of the two translational states associated with the two
components. We are thus led to a generalization of the term
‘translational state’ applied to a photon. For a photon to be in a definite
translational state it need not be associated with one single beam of
light, but may be associated with two or more beams of light which
are the components into which one original beam has been split.† In
the accurate mathematical theory each translational state is associated
with one of the wave functions of ordinary wave optics, which wave
function may describe either a single beam or two or more beams
into which one original beam has been split. Translational states are 
thus superposable in a similar way to wave functions. 

Let us consider now what happens when we determine the energy 
in one of the components. The result of such a determination must 
be, either the whole photon or nothing at all. Thus the photon must 
change suddenly from being partly in one beam and partly in the 
other to being entirely in one of the beams. This sudden change is
due to the disturbance in the translational state of the photon which
the observation necessarily makes. It is impossible to predict in which
of the two beams the photon will be found. Only the probability of 
either result can be calculated from the previous distribution of the 
photon over the two beams. 

One could carry out the energy measurement without destroying the
component beam by, for example, reflecting the beam from a movable
mirror and observing the recoil. Our description of the photon allows
us to infer that, after such an energy measurement, it would not be
possible to bring about any interference effects between the two
components. So long as the photon is partly in one beam and partly in
the other, interference can occur when the two beams are superposed,
but this possibility disappears when the photon is forced entirely into
one of the beams by an observation. The other beam then no longer
enters into the description of the photon, so that it counts as being
entirely in the one beam in the ordinary way for any experiment that
may subsequently be performed on it.

† The circumstance that the superposition idea requires us to generalize our
original meaning of translational states, but that no corresponding generalization
was needed for the states of polarization of the preceding section, is an accidental one [...]

On these lines quantum mechanics is able to effect a reconciliation
of the wave and corpuscular properties of light. The essential point
is the association of each of the translational states of a photon with
one of the wave functions of ordinary wave optics. The nature of this
association cannot be pictured on a basis of classical mechanics, but
is something entirely new. It would be quite wrong to picture the
photon and its associated wave as interacting in the way in which 
particles and waves can interact in classical mechanics. The associa- 
tion can be interpreted only statistically, the wave function giving 
us information about the probability of our finding the photon in any 
particular place when we make an observation of where it is. 

Some time before the discovery of quantum mechanics people 
realized that the connexion between light waves and photons must 
be of a statistical character. What they did not clearly realize, however,
was that the wave function gives information about the probability
of one photon being in a particular place and not the probable
number of photons in that place. The importance of the distinction 
can be made clear in the following way. Suppose we have a beam 
of light consisting of a large number of photons split up into two com- 
ponents of equal intensity. On the assumption that the intensity of 
a beam is connected with the probable number of photons in it, we 
should have half the total number of photons going into each com- 
ponent. If the two components are now made to interfere, we should 
require a photon in one component to be able to interfere with one in
the other. Sometimes these two photons would have to annihilate one
another and other times they would have to produce four photons.
This would contradict the conservation of energy. The new theory, 
which connects the wave function with probabilities for one photon, 
gets over the difficulty by making each photon go partly into each of 
the two components. Each photon then interferes only with itself. 
Interference between two different photons never occurs. 
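
The point that each photon interferes only with itself can be illustrated with a small calculation. This is a minimal sketch assuming an idealized, lossless arrangement in which the two equal components acquire a relative phase `phi` and are recombined into two output ports; the function name `output_probabilities` and the port labels are hypothetical.

```python
import cmath
import math

def output_probabilities(phi):
    """Probabilities of finding one photon at each of two output ports.

    The photon goes partly into each of two equal components; the second
    component picks up a relative phase phi before an idealized lossless
    recombination (an assumption made for this sketch).
    """
    a1 = 1 / math.sqrt(2)                       # amplitude in component 1
    a2 = cmath.exp(1j * phi) / math.sqrt(2)     # amplitude in component 2
    out_plus = (a1 + a2) / math.sqrt(2)         # amplitudes add at one port
    out_minus = (a1 - a2) / math.sqrt(2)        # and subtract at the other
    return abs(out_plus) ** 2, abs(out_minus) ** 2

for phi in (0.0, math.pi / 2, math.pi):
    p_plus, p_minus = output_probabilities(phi)
    print(f"phi={phi:.3f}  P+={p_plus:.3f}  P-={p_minus:.3f}  "
          f"total={p_plus + p_minus:.3f}")
```

Whatever the phase, the two probabilities for the one photon add up to unity, so no photon has to be annihilated or created to produce the bright and dark outputs, and energy is conserved.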

The association of particles with waves discussed above is not 
restricted to the case of light, but is, according to modern theory,
of universal applicability. All kinds of particles are associated with
waves in this way and conversely all wave motion is associated with
particles. Thus all particles can be made to exhibit interference
effects and all wave motion has its energy in the form of quanta. The
reason why these general phenomena are not more obvious is on
account of a law of proportionality between the mass or energy of the
particles and the frequency of the waves, the coefficient being such 
that for waves of familiar frequencies the associated quanta are 
extremely small, while for particles even as light as electrons the 
associated wave frequency is so high that it is not easy to demonstrate 
interference. 

4. Superposition and indeterminacy 

The reader may possibly feel dissatisfied with the attempt in the 
two preceding sections to fit in the existence of photons with the 
classical theory of light. He may argue that a very strange idea has 
been introduced — the possibility of a photon being partly in each of 
two states of polarization, or partly in each of two separate beams — 
but even with the help of this strange idea no satisfying picture of 
the fundamental single-photon processes has been given. He may say
further that this strange idea did not provide any information about 
experimental results for the experiments discussed, beyond what 
could have been obtained from an elementary consideration of 
photons being guided in some vague way by waves. What, then, is 
the use of the strange idea? 

In answer to the first criticism it may be remarked that the main 
object of physical science is not the provision of pictures, but is the 
formulation of laws governing phenomena and the application of 
these laws to the discovery of new phenomena. If a picture exists, 
so much the better; but whether a picture exists or not is a matter
of only secondary importance. In the case of atomic phenomena 
no picture can be expected to exist in the usual sense of the word 
‘picture’, by which is meant a model functioning essentially on 
classical lines. One may, however, extend the meaning of the word 
‘picture’ to include any way of looking at the fundamental laws which
makes their self-consistency obvious. With this extension, one may
gradually acquire a picture of atomic phenomena by becoming 
familiar with the laws of the quantum theory. 

With regard to the second criticism, it may be remarked that for 
many simple experiments with light, an elementary theory of waves 
and photons connected in a vague statistical way would be adequate
to account for the results. In the case of such experiments quantum
mechanics has no further information to give. In the great majority 
of experiments, however, the conditions are too complex for an 
elementary theory of this kind to be applicable and some more 
elaborate scheme, such as is provided by quantum mechanics, is then
needed. The method of description that quantum mechanics gives 
in the more complex cases is applicable also to the simple cases and 
although it is then not really necessary for accounting for the experi- 
mental results, its study in these simple cases is perhaps a suitable 
introduction to its study in the general case. 

There remains an overall criticism that one may make to the whole 
scheme, namely, that in departing from the determinacy of the
classical theory a great complication is introduced into the descrip- 
tion of Nature, which is a highly undesirable feature. This complica- 
tion is undeniable, but it is offset by a great simplification, provided 
by the general principle of superposition of states, which we shall now
go on to consider. But first it is necessary to make precise the important
concept of a ‘state’ of a general atomic system.

Let us take any atomic system, composed of particles or bodies 
with specified properties (mass, moment of inertia, etc.) interacting 
according to specified laws of force. There will be various possible 
motions of the particles or bodies consistent with the laws of force. 
Each such motion is called a state of the system. According to 
classical ideas one could specify a state by giving numerical values 
to all the coordinates and velocities of the various component parts 
of the system at some instant of time, the whole motion being then 
completely determined. Now the argument of pp. 3 and 4 shows that 
we cannot observe a small system with that amount of detail which 
classical theory supposes. The limitation in the power of observation 
puts a limitation on the number of data that can be assigned to a 
state. Thus a state of an atomic system must be specified by fewer 
or more indefinite data than a complete set of numerical values 
for all the coordinates and velocities at some instant of time. In the 
case when the system is just a single photon, a state would be completely
specified by a given state of motion in the sense of § 3
together with a given state of polarization in the sense of § 2. 

A state of a system may be defined as an undisturbed motion that 
is restricted by as many conditions or data as are theoretically
possible without mutual interference or contradiction. In practice
the conditions could be imposed by a suitable preparation of the
system, consisting perhaps in passing it through various kinds of 
sorting apparatus, such as slits and polarimeters, the system being 
left undisturbed after the preparation. The word ‘state’ may be
used to mean either the state at one particular time (after the 
preparation), or the state throughout the whole of time after the 
preparation. To distinguish these two meanings, the latter will be 
called a ‘state of motion’ when there is liable to be ambiguity.

The general principle of superposition of quantum mechanics 
applies to the states, with either of the above meanings, of any one 
dynamical system. It requires us to assume that between these 
states there exist peculiar relationships such that whenever the
system is definitely in one state we can consider it as being partly 
in each of two or more other states. The original state must be 
regarded as the result of a kind of superposition of the two or more 
new states, in a way that cannot be conceived on classical ideas. Any 
state may be considered as the result of a superposition of two or 
more other states, and indeed in an infinite number of ways. Con- 
versely any two or more states may be superposed to give a new 
state. The procedure of expressing a state as the result of super- 
position of a number of other states is a mathematical procedure 
that is always permissible, independent of any reference to physical 
conditions, like the procedure of resolving a wave into Fourier com- 
ponents. Whether it is useful in any particular case, though, depends 
on the special physical conditions of the problem under consideration. 

In the two preceding sections examples were given of the super- 
position principle applied to a system consisting of a single photon. 
§ 2 dealt with states differing only with regard to the polarization and 
§ 3 with states differing only with regard to the motion of the photon
as a whole. 

The nature of the relationships which the superposition principle 
requires to exist between the states of any system is of a kind that 
cannot be explained in terms of familiar physical concepts. One 
cannot in the classical sense picture a system being partly in each of
two states and see the equivalence of this to the system being com- 
pletely in some other state. There is an entirely new idea involved, 
to which one must get accustomed and in terms of which one must
proceed to build up an exact mathematical theory, without having 
any detailed classical picture.

When a state is formed by the superposition of two other states,
it will have properties that are in some vague way intermediate
between those of the two original states and that approach more or
less closely to those of either of them according to the greater or less
‘weight’ attached to this state in the superposition process. The new
state is completely defined by the two original states when their 
relative weights in the superposition process are known, together 
with a certain phase difference, the exact meaning of weights and 
phases being provided in the general case by the mathematical theory. 
In the case of the polarization of a photon their meaning is that pro- 
vided by classical optics, so that, for example, when two perpendicu- 
larly plane polarized states are superposed with equal weights, the 
new state may be circularly polarized in either direction, or linearly 
polarized at an angle π/4, or else elliptically polarized, according to
the phase difference. 
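
The dependence on the phase difference can be illustrated numerically. The following is a minimal sketch, assuming the two perpendicular plane-polarized states are represented by a pair of complex amplitudes in the manner of classical optics; the function names `superpose` and `classify` are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def superpose(phase):
    """Equal-weight superposition of two perpendicular plane polarizations.

    Returns the pair of complex amplitudes (e_x, e_y); e_y carries the
    relative phase. The factor 1/sqrt(2) is only a convenient scaling.
    """
    a = 1 / math.sqrt(2)
    return (complex(a), a * cmath.exp(1j * phase))


def classify(e_x, e_y, tol=1e-12):
    """Classify the polarization as linear, circular or elliptical."""
    cross = e_x.conjugate() * e_y
    # The imaginary part vanishes when the components are in phase or in
    # antiphase, which is the condition for linear polarization.
    if abs(cross.imag) < tol:
        return "linear"
    # Circular polarization needs equal amplitudes and a phase difference
    # of +pi/2 or -pi/2, which makes the real part vanish.
    if abs(abs(e_x) - abs(e_y)) < tol and abs(cross.real) < tol:
        return "circular"
    return "elliptical"


for phase in (0.0, math.pi / 2, -math.pi / 2, math.pi / 4):
    print(round(phase, 3), classify(*superpose(phase)))
```

A phase difference of zero gives linear polarization, a phase difference of plus or minus π/2 gives circular polarization in one direction or the other, and other values give elliptical polarization.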

The non-classical nature of the superposition process is brought 
out clearly if we consider the superposition of two states, A and B, 
such that there exists an observation which, when made on the 
system in state A, is certain to lead to one particular result, a say, and 
when made on the system in state B is certain to lead to some different 
result, b say. What will be the result of the observation when made
on the system in the superposed state? The answer is that the result
will be sometimes a and sometimes b, according to a probability law
depending on the relative weights of A and B in the superposition
process. It will never be different from both a and b. The intermediate
character of the state formed by superposition thus expresses
itself through the probability of a particular result for an observation
being intermediate between the corresponding probabilities for the original
states,† not through the result itself being intermediate between the
corresponding results for the original states.
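
A small simulation may make the statement concrete. The text at this point says only that the probability law depends on the relative weights; the sketch below assumes, as an illustration, that the probabilities are proportional to the squared moduli of the two superposition coefficients, which is an assumption not yet established in the text. The names `result_probabilities` and `observe` are hypothetical.

```python
import random


def result_probabilities(c1, c2):
    """Probabilities of the results a and b for the superposition c1*A + c2*B.

    ASSUMPTION (not stated in the text at this point): the weights are the
    squared moduli of the coefficients.
    """
    w1, w2 = abs(c1) ** 2, abs(c2) ** 2
    total = w1 + w2
    if total == 0:
        raise ValueError("both coefficients are zero: no state is defined")
    return w1 / total, w2 / total


def observe(c1, c2, trials, seed=0):
    """Simulate repeated observations; each result is 'a' or 'b', never between."""
    rng = random.Random(seed)
    p_a, _ = result_probabilities(c1, c2)
    return ["a" if rng.random() < p_a else "b" for _ in range(trials)]


results = observe(3, 4j, trials=10000)
print(results.count("a") / len(results), result_probabilities(3, 4j))
```

Every simulated result is either a or b; only the relative frequencies are intermediate between those of the original states.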

In this way we see that such a drastic departure from ordinary 
ideas as the assumption of superposition relationships between the 
states is possible only on account of the recognition of the importance 
of the disturbance accompanying an observation and of the conse- 
quent indeterminacy in the result of the observation. When an 
observation is made on any atomic system that is in a given state,
in general the result will not be determinate, i.e., if the experiment
is repeated several times under identical conditions several different
results may be obtained. It is a law of nature, though, that if the
experiment is repeated a large number of times, each particular result
will be obtained in a definite fraction of the total number of times, so
that there is a definite probability of its being obtained. This probability
is what the theory sets out to calculate. Only in special cases
when the probability for some result is unity is the result of the
experiment determinate.

† The probability of a particular result for the state formed by superposition is not
always intermediate between those for the original states in the general case when
those for the original states are not zero or unity, so there are restrictions on the
‘intermediateness’ of a state formed by superposition.

The assumption of superposition relationships between the states 
leads to a mathematical theory in which the equations that define 
a state are linear in the unknowns. In consequence of this, people 
have tried to establish analogies with systems in classical mechanics, 
such as vibrating strings or membranes, which are governed by linear 
equations and for which, therefore, a superposition principle holds. 
Such analogies have led to the name ‘Wave Mechanics’ being some- 
times given to quantum mechanics. It is important to remember, 
however, that the superposition that occurs in quantum mechanics is 
of an essentially different nature from any occurring in the classical 
theory, as is shown by the fact that the quantum superposition prin- 
ciple demands indeterminacy in the results of observations in order 
to be capable of a sensible physical interpretation. The analogies are 
thus liable to be misleading. 


5. Mathematical formulation of the principle 

A profound change has taken place during the present century in
the opinions physicists have held on the mathematical foundations 
of their subject. Previously they supposed that the principles of 
Newtonian mechanics would provide the basis for the description 
of the whole of physical phenomena and that all the theoretical
physicist had to do was suitably to develop and apply these principles.
With the recognition that there is no logical reason why
Newtonian and other classical principles should be valid outside the 
domains in which they have been experimentally verified has come 
the realization that departures from these principles are indeed
necessary. Such departures find their expression through the intro- 
duction of new mathematical formalisms, new schemes of axioms 
and rules of manipulation, into the methods of theoretical physics.

Quantum mechanics provides a good example of the new ideas. It
requires the states of a dynamical system and the dynamical variables
to be interconnected in quite strange ways that are unintelligible
from the classical standpoint. The states and dynamical variables
have to be represented by mathematical quantities of different
natures from those ordinarily used in physics. The new scheme
becomes a precise physical theory when all the axioms and rules of 
manipulation governing the mathematical quantities are specified 
and when in addition certain laws are laid down connecting physical 
facts with the mathematical formalism, so that from any given 
physical conditions equations between the mathematical quantities 
may be inferred and vice versa. In an application of the theory one 
would be given certain physical information, which one would pro- 
ceed to express by equations between the mathematical quantities. 
One would then deduce new equations with the help of the axioms 
and rules of manipulation and would conclude by interpreting these 
new equations as physical conditions. The justification for the whole 
scheme depends, apart from internal consistency, on the agreement 
of the final results with experiment. 

We shall begin to set up the scheme by dealing with the mathe- 
matical relations between the states of a dynamical system at one 
instant of time, which relations will come from the mathematical 
formulation of the principle of superposition. The superposition pro- 
cess is a kind of additive process and implies that states can in some 
way be added to give new states. The states must therefore be con- 
nected with mathematical quantities of a kind which can be added 
together to give other quantities of the same kind. The most obvious 
of such quantities are vectors. Ordinary vectors, existing in a space 
of a finite number of dimensions, are not sufficiently general for 
most of the dynamical systems in quantum mechanics. We have to 
make a generalization to vectors in a space of an infinite number of 
dimensions, and the mathematical treatment becomes complicated 
by questions of convergence. For the present, however, we shall deal
merely with some general properties of the vectors, properties which 
can be deduced on the basis of a simple scheme of axioms, and 
questions of convergence and related topics will not be gone into 
until the need arises. 

It is desirable to have a special name for describing the vectors
which are connected with the states of a system in quantum mechanics,
whether they are in a space of a finite or an infinite number of
dimensions. We shall call them ket vectors, or simply kets, and denote
a general one of them by a special symbol |>. If we want to specify
a particular one of them by a label, A say, we insert it in the middle,
thus |A>. The suitability of this notation will become clear as the
scheme is developed.

Ket vectors may be multiplied by complex numbers and may be
added together to give other ket vectors, e.g. from two ket vectors
|A> and |B> we can form

$$c_1|A\rangle + c_2|B\rangle = |R\rangle, \tag{1}$$

say, where c_1 and c_2 are any two complex numbers. We may also
perform more general linear processes with them, such as adding an
infinite sequence of them, and if we have a ket vector |x>, depending
on and labelled by a parameter x which can take on all values in a
certain range, we may integrate it with respect to x, to get another
ket vector

$$\int |x\rangle\,dx = |Q\rangle$$

say. A ket vector which is expressible linearly in terms of certain
others is said to be dependent on them. A set of ket vectors are called
independent if no one of them is expressible linearly in terms of the
others.
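
The notions of linear combination and dependence can be tried out in a space of a finite number of dimensions. The sketch below assumes, purely for illustration, that a ket is represented by a list of complex components; the helper names `combine`, `rank` and `is_independent` are hypothetical, and the finite-dimensional setting sidesteps the questions of convergence mentioned above.

```python
def combine(coeffs, kets):
    """Return the linear combination sum_i coeffs[i] * kets[i]."""
    dim = len(kets[0])
    return [sum(c * k[i] for c, k in zip(coeffs, kets)) for i in range(dim)]


def rank(vectors, tol=1e-12):
    """Rank of a list of complex vectors, by Gaussian elimination."""
    rows = [[complex(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        if r == len(rows):
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_independent(kets):
    """True if no ket of the set is expressible linearly in terms of the others."""
    return rank(kets) == len(kets)


A = [1, 0]
B = [0, 1]
R = combine([2, 1j], [A, B])      # R = 2A + iB, as in equation (1)
print(R, is_independent([A, B]), is_independent([A, B, R]))
```

Here A and B are independent, while the set A, B, R is not, since R is dependent on A and B by construction.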

We now assume that each state of a dynamical system at a particular
time corresponds to a ket vector, the correspondence being such that if a
state results from the superposition of certain other states, its corresponding
ket vector is expressible linearly in terms of the corresponding ket
vectors of the other states, and conversely. Thus the state R results from
a superposition of the states A and B when the corresponding ket
vectors are connected by (1).

The above assumption leads to certain properties of the superposition
process, properties which are in fact necessary for the word
‘superposition’ to be appropriate. When two or more states are
superposed, the order in which they occur in the superposition
process is unimportant, so the superposition process is symmetrical
between the states that are superposed. Again, we see from equation
(1) that (excluding the case when the coefficient c_1 or c_2 is zero) if
the state R can be formed by superposition of the states A and B,
then the state A can be formed by superposition of B and R, and B
can be formed by superposition of A and R. The superposition
relationship is symmetrical between all three states A, B, and R.

A state which results from the superposition of certain other
states will be said to be dependent on those states. More generally,
a state will be said to be dependent on any set of states, finite or
infinite in number, if its corresponding ket vector is dependent on
the corresponding ket vectors of the set of states. A set of states
will be called independent if no one of them is dependent on the
others.

To proceed with the mathematical formulation of the superposition
principle we must introduce a further assumption, namely the assump- 
tion that by superposing a state with itself we cannot form any new 
state, but only the original state over again. If the original state 
corresponds to the ket vector |A>, when it is superposed with itself
the resulting state will correspond to

$$c_1|A\rangle + c_2|A\rangle = (c_1 + c_2)|A\rangle,$$

where c_1 and c_2 are numbers. Now we may have c_1 + c_2 = 0, in which
case the result of the superposition process would be nothing at all,
the two components having cancelled each other by an interference
effect. Our new assumption requires that, apart from this special
case, the resulting state must be the same as the original one, so that
(c_1 + c_2)|A> must correspond to the same state that |A> does. Now
c_1 + c_2 is an arbitrary complex number and hence we can conclude
that if the ket vector corresponding to a state is multiplied by any
complex number, not zero, the resulting ket vector will correspond to the
same state. Thus a state is specified by the direction of a ket vector
and any length one may assign to the ket vector is irrelevant. All
the states of the dynamical system are in one-one correspondence
with all the possible directions for a ket vector, no distinction being
made between the directions of the ket vectors |A> and -|A>.
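
The rule that only the direction of a ket vector matters can be expressed as a simple test of proportionality. The sketch below again assumes a finite-dimensional ket stored as a list of complex components; `same_state`, `scale` and `is_zero` are hypothetical names used only here.

```python
def is_zero(ket, tol=1e-12):
    """True if every component of the ket vanishes."""
    return all(abs(x) < tol for x in ket)


def scale(c, ket):
    """Multiply a ket by the complex number c."""
    return [c * x for x in ket]


def same_state(u, v, tol=1e-12):
    """True if the kets u and v differ only by a non-zero complex factor."""
    if is_zero(u, tol) or is_zero(v, tol):
        raise ValueError("the zero ket vector corresponds to no state")
    n = len(u)
    # Two vectors are proportional exactly when all 2x2 minors vanish.
    return all(
        abs(u[i] * v[j] - u[j] * v[i]) < tol
        for i in range(n)
        for j in range(i + 1, n)
    )


A = [1, 1j]
print(same_state(A, scale(-1, A)))       # |A> and -|A> give the same state
print(same_state(A, scale(2 - 3j, A)))   # so does any non-zero multiple
print(same_state(A, [1, 1]))             # a different direction
```

The zero ket is rejected outright, in line with the remark that cancellation by interference leaves nothing at all.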

The assumption just made shows up very clearly the fundamental 
difference between the superposition of the quantum theory and any 
kind of classical superposition. In the case of a classical system for 
which a superposition principle holds, for instance a vibrating membrane,
when one superposes a state with itself the result is a different
state, with a different magnitude of the oscillations. There is no
physical characteristic of a quantum state corresponding to the
magnitude of the classical oscillations, as distinct from their quality,
described by the ratios of the amplitudes at different points of
the membrane. Again, while there exists a classical state with zero
amplitude of oscillation everywhere, namely the state of rest, there
does not exist any corresponding state for a quantum system, the 
zero ket vector corresponding to no state at all. 

Given two states corresponding to the ket vectors |A> and |B>,
the general state formed by superposing them corresponds to a ket
vector |R> which is determined by two complex numbers, namely
the coefficients c_1 and c_2 of equation (1). If these two coefficients are
multiplied by the same factor (itself a complex number), the ket
vector |R> will get multiplied by this factor and the corresponding
state will be unaltered. Thus only the ratio of the two coefficients 
is effective in determining the state R. Hence this state is deter- 
mined by one complex number, or by two real parameters. Thus 
from two given states, a twofold infinity of states may be obtained 
by superposition. 
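
The counting of parameters can be checked directly: the state R is labelled by the single complex number c_2/c_1, which is unchanged when both coefficients are multiplied by a common factor. The following is a minimal sketch; the function name `coefficient_ratio` and the convention of returning `None` when c_1 vanishes are assumptions made only for this illustration.

```python
import cmath


def coefficient_ratio(c1, c2):
    """Return the ratio c2/c1 that labels the superposed state R.

    Returns None when c1 is zero, in which case R is simply the state B.
    """
    if c1 == 0 and c2 == 0:
        raise ValueError("both coefficients are zero: no state is defined")
    if c1 == 0:
        return None
    return complex(c2) / complex(c1)


factor = 2 - 1j
r1 = coefficient_ratio(1, 1j)
r2 = coefficient_ratio(factor * 1, factor * 1j)
# The two real parameters may be taken as the modulus and phase of the ratio.
print(abs(r1), cmath.phase(r1), abs(r1 - r2) < 1e-12)
```

Restricting the coefficients to real values would make the ratio real, leaving only one real parameter.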

This result is confirmed by the examples discussed in §§ 2 and 3. 
In the example of § 2 there are just two independent states of polari- 
zation for a photon, which may be taken to be the states of plane 
polarization parallel and perpendicular to some fixed direction, and 
from the superposition of these two a twofold infinity of states of 
polarization can be obtained, namely all the states of elliptic polari- 
zation, the general one of which requires two parameters to describe 
it. Again, in the example of § 3, from the superposition of two given 
states of motion for a photon a twofold infinity of states of motion 
may be obtained, the general one of which is described by two 
parameters, which may be taken to be the ratio of the amplitudes 
of the two wave functions that are added together and their phase 
relationship. This confirmation shows the need for allowing complex 
coefficients in equation (1). If these coefficients were restricted to be 
real, then, since only their ratio is of importance for determining the 
direction of the resultant ket vector |R> when |A> and |B> are
given, there would be only a simple infinity of states obtainable from 
the superposition. 

6. Bra and ket vectors 

Whenever we have a set of vectors in any mathematical theory, 
we can always set up a second set of vectors, which mathematicians
call the dual vectors. The procedure will be described for the case 
when the original vectors are our ket vectors. 

Suppose we have a number φ which is a function of a ket vector
|A>, i.e. to each ket vector |A> there corresponds one number φ,
and suppose further that the function is a linear one, which means
that the number corresponding to |A> + |A'> is the sum of the
numbers corresponding to |A> and to |A'>, and the number corresponding
to c|A> is c times the number corresponding to |A>, c
being any numerical factor. Then the number φ corresponding to
any |A> may be looked upon as the scalar product of that |A> with
some new vector, there being one of these new vectors for each linear
function of the ket vectors |A>. The justification for this way of
looking at φ is that, as will be seen later (see equations (5) and (6)),
the new vectors may be added together and may be multiplied by
numbers to give other vectors of the same kind. The new vectors
are, of course, defined only to the extent that their scalar products 
with the original ket vectors are given numbers, but this is suffi- 
cient for one to be able to build up a mathematical theory about 
them. 

We shall call the new vectors bra vectors, or simply bras, and denote
a general one of them by the symbol <|, the mirror image of the
symbol for a ket vector. If we want to specify a particular one of
them by a label, B say, we write it in the middle, thus <B|. The
scalar product of a bra vector <B| and a ket vector |A> will be
written <B|A>, i.e. as a juxtaposition of the symbols for the bra
and ket vectors, that for the bra vector being on the left, and the 
two vertical lines being contracted to one for brevity. 

One may look upon the symbols < and > as a distinctive kind of 
brackets. A scalar product <B|A> now appears as a complete bracket
expression and a bra vector <B| or a ket vector |A> as an incomplete
bracket expression. We have the rules that any complete bracket
expression denotes a number and any incomplete bracket expression
denotes a vector, of the bra or ket kind according to whether it contains
the first or second part of the brackets.

The condition that the scalar product of <B| and |A> is a linear
function of |A> may be expressed symbolically by

$$\langle B|\{|A\rangle + |A'\rangle\} = \langle B|A\rangle + \langle B|A'\rangle, \tag{2}$$

$$\langle B|\{c|A\rangle\} = c\langle B|A\rangle, \tag{3}$$

c being any number.

A bra vector is considered to be completely defined when its scalar
product with every ket vector is given, so that if a bra vector has its
scalar product with every ket vector vanishing, the bra vector itself
must be considered as vanishing. In symbols,

$$\text{if } \langle P|A\rangle = 0 \text{ for all } |A\rangle, \text{ then } \langle P| = 0. \tag{4}$$

The sum of two bra vectors <B| and <B'| is defined by the condition
that its scalar product with any ket vector |A> is the sum of the
scalar products of <B| and <B'| with |A>,

$$\{\langle B| + \langle B'|\}|A\rangle = \langle B|A\rangle + \langle B'|A\rangle, \tag{5}$$

and the product of a bra vector <B| and a number c is defined by the
condition that its scalar product with any ket vector |A> is c times
the scalar product of <B| with |A>,

$$\{c\langle B|\}|A\rangle = c\langle B|A\rangle. \tag{6}$$


Equations (2) and (5) show that products of bra and ket vectors 
satisfy the distributive axiom of multiplication, and equations (3) 
and (6) show that multiplication by numerical factors satisfies the 
usual algebraic axioms. 
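
Equations (2), (3), (5) and (6) can be verified in a finite-dimensional model. The sketch below assumes that a bra is stored as a list of components and that the scalar product with a ket is the plain sum of products of corresponding components; this representation and the names `bracket`, `add` and `scale` are illustrative assumptions, not part of the text.

```python
def bracket(bra, ket):
    """Scalar product <B|A> as the sum of products of components."""
    return sum(b * a for b, a in zip(bra, ket))


def add(u, v):
    """Component-wise sum of two bras or of two kets."""
    return [x + y for x, y in zip(u, v)]


def scale(c, u):
    """Multiply a bra or a ket by the number c."""
    return [c * x for x in u]


# Small Gaussian-integer components keep the arithmetic exact.
B, B2 = [1 + 2j, 3], [0, -1j]
A, A2 = [2, 1j], [1 - 1j, 4]
c = 2 - 1j

print(bracket(B, add(A, A2)) == bracket(B, A) + bracket(B, A2))   # (2)
print(bracket(B, scale(c, A)) == c * bracket(B, A))               # (3)
print(bracket(add(B, B2), A) == bracket(B, A) + bracket(B2, A))   # (5)
print(bracket(scale(c, B), A) == c * bracket(B, A))               # (6)
```

In this model equations (2) and (3) express linearity in the ket, while (5) and (6) serve as the definitions of the sum of two bras and of a numerical multiple of a bra.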

The bra vectors, as they have been here introduced, are quite a 
different kind of vector from the kets, and so far there is no connexion
between them except for the existence of a scalar product of a bra
and a ket. We now make the assumption that there is a one-one
correspondence between the bras and the kets, such that the bra corresponding
to |A> + |A'> is the sum of the bras corresponding to |A> and
to |A'>, and the bra corresponding to c|A> is c̄ times the bra corresponding
to |A>, c̄ being the conjugate complex number to c. We shall
use the same label to specify a ket and the corresponding bra. Thus
the bra corresponding to |A> will be written <A|.

The relationship between a ket vector and the corresponding bra 
makes it reasonable to call one of them the conjugate imaginary of 
the other. Our bra and ket vectors are complex quantities, since they 
can be multiplied by complex numbers and are then of the same 
nature as before, but they are complex quantities of a special kind 
which cannot be split up into real and pure imaginary parts. The 
usual method of getting the real part of a complex quantity, by 
taking half the sum of the quantity itself and its conjugate, cannot 
be applied since a bra and a ket vector are of different natures and 
cannot be added together. To call attention to this distinction, we 
shall use the words ‘conjugate complex’ to refer to numbers and
other complex quantities which can be split up into real and pure
imaginary parts, and the words ‘conjugate imaginary’ for bra and
ket vectors, which cannot. With the former kind of quantity, we 
shall use the notation of putting a bar over one of them to get the 
conjugate complex one. 

On account of the one-one correspondence between bra vectors and 
ket vectors, any state of our dynamical system at a particular time may 
be specified by the direction of a bra vector just as well as by the direction 
of a ket vector. In fact the whole theory will be symmetrical in its 
essentials between bras and kets. 

Given any two ket vectors |A> and |B>, we can construct from
them a number <B|A> by taking the scalar product of the first with
the conjugate imaginary of the second. This number depends linearly
on |A> and antilinearly on |B>, the antilinear dependence meaning
that the number formed from |B> + |B'> is the sum of the numbers
formed from |B> and from |B'>, and the number formed from c|B>
is c̄ times the number formed from |B>. There is a second way in
which we can construct a number which depends linearly on |A> and
antilinearly on |B>, namely by forming the scalar product of |B>
with the conjugate imaginary of |A> and taking the conjugate complex
of this scalar product. We assume that these two numbers are
always equal, i.e.

$$\langle B|A\rangle = \overline{\langle A|B\rangle}. \tag{7}$$


Putting |B> = |A> here, we find that the number <A|A> must be
real. We make the further assumption

$$\langle A|A\rangle > 0, \tag{8}$$

except when |A> = 0.
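
Assumptions (7) and (8) hold automatically in the familiar finite-dimensional model in which the bra corresponding to a ket has the complex-conjugated components. The sketch below takes that model as an assumption for illustration; `bra_of`, `bracket` and `scale` are hypothetical names.

```python
def bra_of(ket):
    """Conjugate imaginary of a ket: the bra with conjugated components."""
    return [complex(x).conjugate() for x in ket]


def bracket(bra, ket):
    """Scalar product <B|A> as the sum of products of components."""
    return sum(b * a for b, a in zip(bra, ket))


def scale(c, u):
    """Multiply a bra or a ket by the number c."""
    return [c * x for x in u]


A = [1 + 1j, 2]
B = [3, -1j]
c = 2 - 1j

b_a = bracket(bra_of(B), A)          # <B|A>
a_b = bracket(bra_of(A), B)          # <A|B>
print(b_a, a_b.conjugate())          # equal, as required by (7)

a_a = bracket(bra_of(A), A)          # <A|A>
print(a_a)                           # real and positive, as required by (8)

# The bra corresponding to c|A> is the conjugate of c times the bra <A|.
print(bra_of(scale(c, A)) == scale(c.conjugate(), bra_of(A)))
```

The last line checks the antilinear character of the correspondence between kets and bras.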

In ordinary space, from any two vectors one can construct a 
number — their scalar product — which is a real number and is sym- 
metrical between them. In the space of bra vectors or the space of 
ket vectors, from any two vectors one can again construct a number,
the scalar product of one with the conjugate imaginary of the
other, but this number is complex and goes over into the conjugate
complex number when the two vectors are interchanged. There is
thus a kind of perpendicularity in these spaces, which is a generalization
of the perpendicularity in ordinary space. We shall call a bra
and a ket vector orthogonal if their scalar product is zero, and two
bras or two kets will be called orthogonal if the scalar product of one
with the conjugate imaginary of the other is zero. Further, we shall
say that two states of our dynamical system are orthogonal if the
vectors corresponding to these states are orthogonal. 

The length of a bra vector <A| or of the conjugate imaginary ket
vector |A> is defined as the square root of the positive number
<A|A>. When we are given a state and wish to set up a bra or ket
vector to correspond to it, only the direction of the vector is given 
and the vector itself is undetermined to the extent of an arbitrary 
numerical factor. It is often convenient to choose this numerical 
factor so that the vector is of length unity. This procedure is called 
normalization and the vector so chosen is said to be normalized. The 
vector is not completely determined even then, since one can still 
multiply it by any number of modulus unity, i.e. any number e^{iγ}
where γ is real, without changing its length. We shall call such a
number a phase factor. 
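
Normalization and the residual freedom of a phase factor can be illustrated as follows. This is a minimal sketch assuming a ket stored as a list of complex components, with the length computed from the sum of squared moduli; the names `length`, `normalize` and `apply_phase` are hypothetical.

```python
import cmath
import math


def length(ket):
    """Length of a ket: the square root of <A|A>."""
    return math.sqrt(sum(abs(x) ** 2 for x in ket))


def normalize(ket):
    """Rescale a non-zero ket to length unity."""
    n = length(ket)
    if n == 0:
        raise ValueError("the zero ket vector cannot be normalized")
    return [x / n for x in ket]


def apply_phase(gamma, ket):
    """Multiply a ket by the phase factor exp(i*gamma), with gamma real."""
    factor = cmath.exp(1j * gamma)
    return [factor * x for x in ket]


k = normalize([3, 4j])
print(k, length(k))
print(length(apply_phase(0.7, k)))   # still of length unity
```

The ket and its phase-shifted copy are both normalized and correspond to the same state.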

The foregoing assumptions give the complete scheme of relations
between the states of a dynamical system at a particular time. The 
relations appear in mathematical form, but they imply physical 
conditions, which will lead to results expressible in terms of observa- 
tions when the theory is developed further. For instance, if two states 
are orthogonal, it means at present simply a certain equation in our 
formalism, but this equation implies a definite physical relationship
between the states, which further developments of the theory will
enable us to interpret in terms of observational results (see the 
bottom of p. 35).

7. Linear operators

In the preceding section we considered a number which is a linear 
function of a ket vector, and this led to the concept of a bra vector. 
We shall now consider a ket vector which is a linear function of a 
ket vector, and this will lead to the concept of a linear operator. 

Suppose we have a ket |F> which is a function of a ket |A>, i.e.
to each ket |A> there corresponds one ket |F>, and suppose further
that the function is a linear one, which means that the |F> corresponding
to |A> + |A'> is the sum of the |F>'s corresponding to |A>
and to |A'>, and the |F> corresponding to c|A> is c times the |F>
corresponding to |A>, c being any numerical factor. Under these
conditions, we may look upon the passage from |A> to |F> as the
application of a linear operator to |A>. Introducing the symbol α
for the linear operator, we may write

$$|F\rangle = \alpha|A\rangle,$$

in which the result of α operating on |A> is written like a product
of α with |A>. We make the rule that in such products the ket vector
must always be put on the right of the linear operator. The above
conditions of linearity may now be expressed by the equations

$$\alpha\{|A\rangle + |A'\rangle\} = \alpha|A\rangle + \alpha|A'\rangle, \qquad \alpha\{c|A\rangle\} = c\,\alpha|A\rangle. \tag{1}$$

A linear operator is considered to be completely defined when the 
result of its application to every ket vector is given. Thus a linear 
operator is to be considered zero if the result of its application to every 
ket vanishes, and two linear operators are to be considered equal if 
they produce the same result when applied to every ket. 

Linear operators can be added together, the sum of two linear 
operators being defined to be that linear operator which, operating 
on any ket, produces the sum of what the two linear operators 
separately would produce. Thus $\alpha+\beta$ is defined by

$$\{\alpha+\beta\}|A\rangle = \alpha|A\rangle + \beta|A\rangle \tag{2}$$

for any $|A\rangle$. Equation (2) and the first of equations (1) show that
products of linear operators with ket vectors satisfy the distributive
axiom of multiplication.
Linear operators can also be multiplied together, the product of
two linear operators being defined as that linear operator, the
application of which to any ket produces the same result as the application
of the two linear operators successively. Thus the product $\alpha\beta$ is
defined as the linear operator which, operating on any ket $|A\rangle$,
changes it into that ket which one would get by operating first on
$|A\rangle$ with $\beta$, and then on the result of the first operation with
$\alpha$. In symbols

$$\{\alpha\beta\}|A\rangle = \alpha\{\beta|A\rangle\}.$$

This definition appears as the associative axiom of multiplication for
the triple product of $\alpha$, $\beta$, and $|A\rangle$, and allows us to write
this triple product as $\alpha\beta|A\rangle$ without brackets. However, this
triple product is in general not the same as what we should get if we
operated on $|A\rangle$ first with $\alpha$ and then with $\beta$, i.e. in
general $\alpha\beta|A\rangle$ differs from $\beta\alpha|A\rangle$, so that in
general $\alpha\beta$ must differ from $\beta\alpha$. The commutative axiom of
multiplication does not hold for linear operators. It may happen as a
special case that two linear operators $\xi$ and $\eta$ are such that
$\xi\eta$ and $\eta\xi$ are equal. In this case we say that $\xi$ commutes
with $\eta$, or that $\xi$ and $\eta$ commute.

By repeated applications of the above processes of adding and 
multiplying linear operators, one can form sums and products of 
more than two of them, and one can proceed to build up an algebra 
with them. In this algebra the commutative axiom of multiplication 
does not hold, and also the product of two linear operators may 
vanish without either factor vanishing. But all the other axioms of 
ordinary algebra, including the associative and distributive axioms 
of multiplication, are valid, as may easily be verified. 
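
The two ways in which this algebra departs from ordinary algebra can be seen in a small sketch. It assumes, as an illustration only, that linear operators are represented by 2 x 2 matrices and kets by lists of two components; the matrices `alpha` and `beta` and the helpers `matmul` and `apply` are hypothetical choices made for the example.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(op, ket):
    """Result of a linear operator (matrix) operating on a ket (list)."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

alpha = [[0, 1], [0, 0]]
beta = [[0, 0], [1, 0]]
ket_a = [1, 0]

alpha_beta = matmul(alpha, beta)
beta_alpha = matmul(beta, alpha)

# Associative axiom: {alpha beta}|A> equals alpha{beta|A>}.
print(apply(alpha_beta, ket_a), apply(alpha, apply(beta, ket_a)))

# Commutative axiom fails: alpha beta differs from beta alpha.
print(alpha_beta, beta_alpha)

# A product may vanish without either factor vanishing.
print(matmul(alpha, alpha))
```

With these particular matrices the first line prints the same ket twice, the second prints two different matrices, and the third prints the zero matrix although `alpha` itself is not zero.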

If we take a number k and multiply it into ket vectors, it appears 
as a linear operator operating on ket vectors, the conditions (1) being
fulfilled with $k$ substituted for $\alpha$. A number is thus a special case of
a linear operator. It has the property that it commutes with all linear 
operators and this property distinguishes it from a general linear 
operator. 

So far we have considered linear operators operating only on ket 
vectors. We can give a meaning to their operating also on bra vectors, 
in the following way. Take the scalar product of any bra $\langle B|$ with
the ket $\alpha|A\rangle$. This scalar product is a number which depends
linearly on $|A\rangle$ and therefore, from the definition of bras, it may be
considered as the scalar product of $|A\rangle$ with some bra. The bra thus
defined depends linearly on $\langle B|$, so we may look upon it as the result
of some linear operator applied to $\langle B|$. This linear operator is
uniquely determined by the original linear operator $\alpha$ and may
reasonably be called the same linear operator operating on a bra. In this way our
linear operators are made capable of operating on bra vectors. 

A suitable notation to use for the resulting bra when $\alpha$ operates on
the bra $\langle B|$ is $\langle B|\alpha$, as in this notation the equation
which defines $\langle B|\alpha$ is

$$\{\langle B|\alpha\}|A\rangle = \langle B|\{\alpha|A\rangle\} \tag{3}$$

for any $|A\rangle$, which simply expresses the associative axiom of
multiplication for the triple product of $\langle B|$, $\alpha$, and
$|A\rangle$. We therefore make the general rule that in a product of a bra and
a linear operator, the bra must always be put on the left. We can now write
the triple product of $\langle B|$, $\alpha$, and $|A\rangle$ simply as
$\langle B|\alpha|A\rangle$ without brackets. It
may easily be verified that the distributive axiom of multiplication 
holds for products of bras and linear operators just as well as for 
products of linear operators and kets. 

There is one further kind of product which has a meaning in our
scheme, namely the product of a ket vector and a bra vector with
the ket on the left, such as $|A\rangle\langle B|$. To examine this product,
let us multiply it into an arbitrary ket $|P\rangle$, putting the ket on the
right, and assume the associative axiom of multiplication. The product is
then $|A\rangle\langle B|P\rangle$, which is another ket, namely $|A\rangle$
multiplied by the number $\langle B|P\rangle$, and this ket depends linearly
on the ket $|P\rangle$. Thus $|A\rangle\langle B|$ appears as a linear operator
that can operate on kets. It can also operate on bras, its product with a
bra $\langle Q|$ on the left being $\langle Q|A\rangle\langle B|$, which is the
number $\langle Q|A\rangle$ times the bra $\langle B|$. The product
$|A\rangle\langle B|$ is to be sharply distinguished from the product
$\langle B|A\rangle$ of the same factors in the reverse order, the latter
product being, of course, a number.
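
The distinction between $|A\rangle\langle B|$ and $\langle B|A\rangle$ can be made concrete in a short sketch. It assumes, for illustration only, that kets are lists of complex components and that a bra is obtained by conjugating the components of the corresponding ket; the function names and sample vectors are hypothetical.

```python
def inner(a, b):
    """The number <A|B>."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def ket_bra(ket_a, ket_b):
    """Matrix of |A><B|: entry (i, j) is A_i times the conjugate of B_j."""
    return [[x * complex(y).conjugate() for y in ket_b] for x in ket_a]

def apply(op, ket):
    """Result of a linear operator (matrix) operating on a ket (list)."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

ket_a = [1 + 0j, 2j]
ket_b = [1j, 1 + 0j]
ket_p = [2 + 0j, 1 - 1j]

operator = ket_bra(ket_a, ket_b)          # |A><B| is a linear operator
lhs = apply(operator, ket_p)              # {|A><B|}|P>
number = inner(ket_b, ket_p)              # <B|P> is a number
rhs = [number * x for x in ket_a]         # |A> multiplied by <B|P>

print(lhs, rhs)
print(inner(ket_b, ket_a))                # <B|A>, a single number
```

The two printed kets coincide, in agreement with the associative axiom assumed in the text.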

We now have a complete algebraic scheme involving three kinds 
of quantities, bra vectors, ket vectors, and linear operators. They can 
be multiplied together in the various ways discussed above, and the 
associative and distributive axioms of multiplication always hold, 
but the commutative axiom of multiplication does not hold. In this 
general scheme we still have the rules of notation of the preceding 
section, that any complete bracket expression, containing < on the
left and > on the right, denotes a number, while any incomplete
bracket expression, containing only < or >, denotes a vector.
With regard to the physical significance of the scheme, we have
already assumed that the bra vectors and ket vectors, or rather the
directions of these vectors, correspond to the states of a dynamical
system at a particular time. We now make the further assumption
that the linear operators correspond to the dynamical variables at that
time. By dynamical variables are meant quantities such as the
coordinates and the components of velocity, momentum and angular 
momentum of particles, and functions of these quantities — in fact 
the variables in terms of which classical mechanics is built up. The 
new assumption requires that these quantities shall occur also in 
quantum mechanics, but with the striking difference that they are 
now subject to an algebra in which the commutative axiom of multiplica- 
tion does not hold. 

This different algebra for the dynamical variables is one of the 
most important ways in which quantum mechanics differs from 
classical mechanics. We shall see later on that, in spite of this funda- 
mental difference, the dynamical variables of quantum mechanics 
still have many properties in common with their classical counter- 
parts and it will be possible to build up a theory of them closely 
analogous to the classical theory and forming a beautiful generaliza- 
tion of it. 

It is convenient to use the same letter to denote a dynamical 
variable and the corresponding linear operator. In fact, we may con- 
sider a dynamical variable and the corresponding linear operator to 
be both the same thing, without getting into confusion. 

8. Conjugate relations 

Our linear operators are complex quantities, since one can multiply 
them by complex numbers and get other quantities of the same nature.
Hence they must correspond in general to complex dynamical vari- 
ables, i.e. to complex functions of the coordinates, velocities, etc. We 
need some further development of the theory to see what kind of 
linear operator corresponds to a real dynamical variable. 

Consider the ket which is the conjugate imaginary of $\langle P|\alpha$. This
ket depends antilinearly on $\langle P|$ and thus depends linearly on
$|P\rangle$. It may therefore be considered as the result of some linear
operator operating on $|P\rangle$. This linear operator is called the adjoint
of $\alpha$ and we shall denote it by $\bar\alpha$. With this notation, the
conjugate imaginary of $\langle P|\alpha$ is $\bar\alpha|P\rangle$.
In formula (7) of Chapter I put $\langle P|\alpha$ for $\langle A|$ and its
conjugate imaginary $\bar\alpha|P\rangle$ for $|A\rangle$. The result is

$$\langle B|\bar\alpha|P\rangle = \overline{\langle P|\alpha|B\rangle}. \tag{4}$$

This is a general formula holding for any ket vectors $|B\rangle$, $|P\rangle$
and any linear operator $\alpha$, and it expresses one of the most frequently
used properties of the adjoint.

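Putting $\bar\alpha$ for $\alpha$ in (4), we get

$$\langle B|\bar{\bar\alpha}|P\rangle = \overline{\langle P|\bar\alpha|B\rangle} = \langle B|\alpha|P\rangle,$$

by using (4) again with $|P\rangle$ and $|B\rangle$ interchanged. This holds
for any ket $|P\rangle$, so we can infer from (4) of Chapter I,

$$\langle B|\bar{\bar\alpha} = \langle B|\alpha,$$

and since this holds for any bra vector $\langle B|$, we can infer

$$\bar{\bar\alpha} = \alpha.$$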

Thus the adjoint of the adjoint of a linear operator is the original linear
operator. This property of the adjoint makes it like the conjugate
complex of a number, and it is easily verified that in the special case
when the linear operator is a number, the adjoint linear operator is
the conjugate complex number. Thus it is reasonable to assume that
the adjoint of a linear operator corresponds to the conjugate complex of
a dynamical variable. With this physical significance for the adjoint
of a linear operator, we may call the adjoint alternatively the
conjugate complex linear operator, which conforms with our notation
$\bar\alpha$.
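
Formula (4) and the rule that the adjoint of the adjoint is the original operator can be checked numerically in a small sketch. It rests on an assumption not made in the text at this point: that operators are represented by matrices, kets by lists of components, and the adjoint by the conjugate transpose of the matrix. All names and sample numbers are hypothetical.

```python
def adjoint(op):
    """Conjugate transpose, taken here as the matrix form of the adjoint."""
    n = len(op)
    return [[complex(op[j][i]).conjugate() for j in range(n)] for i in range(n)]

def apply(op, ket):
    """Result of a linear operator (matrix) operating on a ket (list)."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

def inner(a, b):
    """The number <A|B>."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

alpha = [[1 + 2j, 3j], [2, -1j]]
ket_b = [1, 1j]
ket_p = [2 - 1j, 1]

# Left and right sides of formula (4).
lhs = inner(ket_b, apply(adjoint(alpha), ket_p))        # <B|adj(alpha)|P>
rhs = inner(ket_p, apply(alpha, ket_b)).conjugate()     # conj of <P|alpha|B>
print(lhs, rhs)

# The adjoint of the adjoint gives back the original operator.
print(adjoint(adjoint(alpha)) == alpha)
```

Both sides of (4) come out as the same complex number for these sample values.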

A linear operator may equal its adjoint, and is then called self- 
adjoint. It corresponds to a real dynamical variable, so it may be 
called alternatively a real linear operator. Any linear operator may
be split up into a real part and a pure imaginary part. For this
reason the words ‘conjugate complex’ are applicable to linear
operators and not the words ‘conjugate imaginary’.
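
One way of writing such a splitting explicitly, given here as an illustrative sketch rather than as a formula quoted from the text, is

$$\alpha = \tfrac{1}{2}(\alpha+\bar\alpha) + i\,\bigl\{-\tfrac{1}{2}i(\alpha-\bar\alpha)\bigr\}.$$

Assuming that the adjoint of a number times a linear operator is the conjugate complex number times the adjoint operator, both $\tfrac{1}{2}(\alpha+\bar\alpha)$ and $-\tfrac{1}{2}i(\alpha-\bar\alpha)$ equal their own adjoints, so the first term is the real part and the second term is $i$ times a real linear operator, i.e. the pure imaginary part.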

The conjugate complex of the sum of two linear operators is 
obviously the sum of their conjugate complexes. To get the conjugate 
complex of the product of two linear operators $\alpha$ and $\beta$, we apply
formula (7) of Chapter I with

$$\langle A| = \langle P|\alpha, \qquad \langle B| = \langle Q|\bar\beta,$$

so that

$$|A\rangle = \bar\alpha|P\rangle, \qquad |B\rangle = \beta|Q\rangle.$$

The result is

$$\langle Q|\bar\beta\bar\alpha|P\rangle = \overline{\langle P|\alpha\beta|Q\rangle} = \langle Q|\overline{\alpha\beta}|P\rangle$$

from (4). Since this holds for any $|P\rangle$ and $\langle Q|$, we can infer that

$$\bar\beta\bar\alpha = \overline{\alpha\beta}. \tag{5}$$

Thus the conjugate complex of the product of two linear operators equals
the product of the conjugate complexes of the factors in the reverse order.

As simple examples of this result, it should be noted that, if $\xi$ and
$\eta$ are real, in general $\xi\eta$ is not real. This is an important
difference from classical mechanics. However, $\xi\eta+\eta\xi$ is real, and
so is $i(\xi\eta-\eta\xi)$. Only when $\xi$ and $\eta$ commute is $\xi\eta$
itself also real. Further, if $\xi$ is real, then so is $\xi^2$ and, more
generally, $\xi^n$ with $n$ any positive integer.
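
These statements can be tried out on a pair of sample operators. The sketch below assumes, for illustration, that operators are 2 x 2 matrices and that the adjoint is the conjugate transpose; the matrices `xi` and `eta` and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(op):
    """Conjugate transpose, taken here as the matrix form of the adjoint."""
    n = len(op)
    return [[complex(op[j][i]).conjugate() for j in range(n)] for i in range(n)]

def combine(ca, a, cb, b):
    """The matrix ca*a + cb*b for numbers ca, cb."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def is_real(op, tol=1e-12):
    """True when the operator equals its adjoint, i.e. is a real operator."""
    adj = adjoint(op)
    n = len(op)
    return all(abs(op[i][j] - adj[i][j]) < tol
               for i in range(n) for j in range(n))

xi = [[1, 1], [1, 0]]          # real (self-adjoint)
eta = [[0, -1j], [1j, 1]]      # real (self-adjoint), does not commute with xi

xi_eta = matmul(xi, eta)
eta_xi = matmul(eta, xi)

print(is_real(xi), is_real(eta))                      # the two factors
print(is_real(xi_eta))                                # the bare product
print(is_real(combine(1, xi_eta, 1, eta_xi)))         # xi eta + eta xi
print(is_real(combine(1j, xi_eta, -1j, eta_xi)))      # i(xi eta - eta xi)
print(is_real(matmul(xi, xi)))                        # xi squared
```

Only the bare product fails the reality test for this pair; the symmetrized combinations and the square pass it, as rule (5) requires.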

We may get the conjugate complex of the product of three linear 
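operators by successive applications of the rule (5) for the conjugate
complex of the product of two of them. We have

$$\overline{\alpha\beta\gamma} = \overline{\alpha(\beta\gamma)} = \overline{\beta\gamma}\,\bar\alpha = \bar\gamma\bar\beta\bar\alpha, \tag{6}$$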

so the conjugate complex of the product of three linear operators 
equals the product of the conjugate complexes of the factors in the 
reverse order. The rule may easily be extended to the product of any 
number of linear operators. 

In the preceding section we saw that the product $|A\rangle\langle B|$ is a
linear operator. We may get its conjugate complex by referring directly to
the definition of the adjoint. Multiplying $|A\rangle\langle B|$ into a
general bra $\langle P|$ we get $\langle P|A\rangle\langle B|$, whose conjugate
imaginary ket is

$$\overline{\langle P|A\rangle}\,|B\rangle = \langle A|P\rangle|B\rangle = |B\rangle\langle A|P\rangle.$$

Hence

$$\overline{|A\rangle\langle B|} = |B\rangle\langle A|. \tag{7}$$

We now have several rules concerning conjugate complexes and
conjugate imaginaries of products, namely equation (7) of Chapter I,
equations (4), (5), (6), (7) of this chapter, and the rule that the
conjugate imaginary of $\langle P|\alpha$ is $\bar\alpha|P\rangle$. These rules
can all be summed up in a single comprehensive rule: the conjugate complex
or conjugate imaginary of any product of bra vectors, ket vectors, and
linear operators is obtained by taking the conjugate complex or conjugate
imaginary of each factor and reversing the order of all the factors. The
rule is easily verified to hold quite generally, also for the cases not
explicitly given above.

Theorem. If $\xi$ is a real linear operator and

$$\xi^m|P\rangle = 0 \tag{8}$$

for a particular ket $|P\rangle$, $m$ being a positive integer, then

$$\xi|P\rangle = 0.$$


To prove the theorem, take first the case when $m = 2$. Equation
(8) then gives

$$\langle P|\xi^2|P\rangle = 0,$$

showing that the ket $\xi|P\rangle$ multiplied by the conjugate imaginary bra
$\langle P|\xi$ is zero. From the assumption (8) of Chapter I with
$\xi|P\rangle$ for $|A\rangle$, we see that $\xi|P\rangle$ must be zero. Thus
the theorem is proved for $m = 2$. Now take $m > 2$ and put

$$\xi^{m-2}|P\rangle = |Q\rangle.$$

Equation (8) now gives

$$\xi^2|Q\rangle = 0.$$

Applying the theorem for $m = 2$, we get

$$\xi|Q\rangle = 0$$

or

$$\xi^{m-1}|P\rangle = 0. \tag{9}$$

By repeating the process by which equation (9) is obtained from
(8), we obtain successively

$$\xi^{m-2}|P\rangle = 0, \quad \xi^{m-3}|P\rangle = 0, \quad \ldots, \quad \xi^2|P\rangle = 0, \quad \xi|P\rangle = 0,$$

and so the theorem is proved generally. 
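
The role played by the reality of $\xi$ in this theorem can be seen from a small counter-example, offered as an illustrative sketch. It assumes that operators are represented by 2 x 2 matrices with the adjoint given by the conjugate transpose; the matrices and kets chosen are hypothetical.

```python
def apply(op, ket):
    """Result of a linear operator (matrix) operating on a ket (list)."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

def adjoint(op):
    """Conjugate transpose, taken here as the matrix form of the adjoint."""
    n = len(op)
    return [[complex(op[j][i]).conjugate() for j in range(n)] for i in range(n)]

# An operator that is NOT real: it differs from its adjoint.
alpha = [[0, 1], [0, 0]]
ket_p = [0, 1]
once = apply(alpha, ket_p)        # alpha|P>
twice = apply(alpha, once)        # alpha^2|P>
print(adjoint(alpha) == alpha, once, twice)

# A real operator for which xi^2|Q> = 0 does force xi|Q> = 0.
xi = [[1, 1], [1, 1]]
ket_q = [1, -1]
print(adjoint(xi) == xi, apply(xi, ket_q), apply(xi, apply(xi, ket_q)))
```

For the non-real `alpha` the square annihilates the ket while `alpha` itself does not, so the conclusion of the theorem cannot be expected without the assumption that the operator is real.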

9. Eigenvalues and eigenvectors 
We must make a further development of the theory of linear
operators, consisting in studying the equation

$$\alpha|P\rangle = a|P\rangle, \tag{10}$$

where $\alpha$ is a linear operator and $a$ is a number. This equation usually
presents itself in the form that $\alpha$ is a known linear operator and the
number $a$ and the ket $|P\rangle$ are unknowns, which we have to try to
choose so as to satisfy (10), ignoring the trivial solution $|P\rangle = 0$.

Equation (10) means that the linear operator $\alpha$ applied to the ket
$|P\rangle$ just multiplies this ket by a numerical factor without changing
its direction, or else multiplies it by the factor zero, so that it ceases
to have a direction. This same $\alpha$ applied to other kets will, of course,
in general change both their lengths and their directions. It should
be noticed that only the direction of $|P\rangle$ is of importance in equation
(10). If one multiplies $|P\rangle$ by any number not zero, it will not affect
the question of whether (10) is satisfied or not.

Together with equation (10), we should consider also the conjugate
imaginary form of equation

$$\langle Q|\alpha = b\langle Q|, \tag{11}$$

where $b$ is a number. Here the unknowns are the number $b$ and the
non-zero bra $\langle Q|$. Equations (10) and (11) are of such fundamental
importance in the theory that it is desirable to have some special
words to describe the relationships between the quantities involved.
If (10) is satisfied, we shall call $a$ an eigenvalue† of the linear operator
$\alpha$, or of the corresponding dynamical variable, and we shall call
$|P\rangle$ an eigenket of the linear operator or dynamical variable. Further,
we shall say that the eigenket $|P\rangle$ belongs to the eigenvalue $a$.
Similarly, if (11) is satisfied, we shall call $b$ an eigenvalue of $\alpha$
and $\langle Q|$ an eigenbra belonging to this eigenvalue. The words
eigenvalue, eigenket, eigenbra have a meaning, of course, only with
reference to a linear operator or dynamical variable.

Using this terminology, we can assert that, if an eigenket of $\alpha$ is
multiplied by any number not zero, the resulting ket is also an
eigenket and belongs to the same eigenvalue as the original one.
It is possible to have two or more independent eigenkets of a linear
operator belonging to the same eigenvalue of that linear operator,
e.g. equation (10) may have several solutions, $|P1\rangle$, $|P2\rangle$,
$|P3\rangle$, ... say, all holding for the same value of $a$, with the various
eigenkets $|P1\rangle$, $|P2\rangle$, $|P3\rangle$, ... independent. In this
case it is evident that any linear combination of the eigenkets is another
eigenket belonging to the same eigenvalue of the linear operator, e.g.

$$c_1|P1\rangle + c_2|P2\rangle + c_3|P3\rangle + \ldots$$

is another solution of (10), where $c_1, c_2, c_3, \ldots$ are any numbers.

In the special case when the linear operator $\alpha$ of equations (10) and
(11) is a number, $k$ say, it is obvious that any ket $|P\rangle$ and bra
$\langle Q|$ will satisfy these equations provided $a$ and $b$ equal $k$. Thus
a number considered as a linear operator has just one eigenvalue, and any ket
is an eigenket and any bra is an eigenbra, belonging to this eigenvalue.

The theory of eigenvalues and eigenvectors of a linear operator $\alpha$
which is not real is not of much use for quantum mechanics. We
shall therefore confine ourselves to real linear operators for the further
development of the theory. Putting for $\alpha$ the real linear operator
$\xi$, we have instead of equations (10) and (11)

$$\xi|P\rangle = a|P\rangle, \tag{12}$$

$$\langle Q|\xi = b\langle Q|. \tag{13}$$

† The word ‘proper’ is sometimes used instead of ‘eigen’, but this is not
satisfactory as the words ‘proper’ and ‘improper’ are often used with other
meanings. For example, in §§ 15 and 46 the words ‘improper function’ and
‘proper-energy’ are used.
Three important results can now be readily deduced.

(i) The eigenvalues are all real numbers. To prove that $a$ satisfying
(12) is real, we multiply (12) by the bra $\langle P|$ on the left, obtaining

$$\langle P|\xi|P\rangle = a\langle P|P\rangle.$$

Now from equation (4) with $\langle B|$ replaced by $\langle P|$ and $\alpha$
replaced by the real linear operator $\xi$, we see that the number
$\langle P|\xi|P\rangle$ must be real, and from (8) of § 6,
$\langle P|P\rangle$ must be real and not zero. Hence $a$ is real. Similarly,
by multiplying (13) by $|Q\rangle$ on the right, we can prove that $b$ is real.

Suppose we have a solution of (12) and we form the conjugate
imaginary equation, which will read

$$\langle P|\xi = a\langle P|$$

in view of the reality of $\xi$ and $a$. This conjugate imaginary equation
now provides a solution of (13), with $\langle Q| = \langle P|$ and $b = a$.
Thus we can infer

(ii) The eigenvalues associated with eigenkets are the same as the 
eigenvalues associated with eigenbras. 

(iii) The conjugate imaginary of any eigenket is an eigenbra belonging
to the same eigenvalue, and conversely. This last result makes it
reasonable to call the state corresponding to any eigenket or to the
conjugate imaginary eigenbra an eigenstate of the real dynamical variable
$\xi$.

Eigenvalues and eigenvectors of various real dynamical variables
are used very extensively in quantum mechanics, so it is desirable
to have some systematic notation for labelling them. The following
is suitable for most purposes. If $\xi$ is a real dynamical variable, we
call its eigenvalues $\xi'$, $\xi''$, $\xi^r$, etc. Thus we have a letter by
itself denoting a real dynamical variable or a real linear operator, and the
same letter with primes or an index attached denoting a number,
namely an eigenvalue of what the letter by itself denotes. An
eigenvector may now be labelled by the eigenvalue to which it belongs.
Thus $|\xi'\rangle$ denotes an eigenket belonging to the eigenvalue $\xi'$ of
the dynamical variable $\xi$. If in a piece of work we deal with more than
one eigenket belonging to the same eigenvalue of a dynamical variable,
we may distinguish them one from another by means of a further
label, or possibly of more than one further label. Thus, if we are
dealing with two eigenkets belonging to the same eigenvalue of $\xi$,
we may call them $|\xi'1\rangle$ and $|\xi'2\rangle$.
Theorem. Two eigenvectors of a real dynamical variable belonging
to different eigenvalues are orthogonal.

To prove the theorem, let $|\xi'\rangle$ and $|\xi''\rangle$ be two eigenkets of
the real dynamical variable $\xi$, belonging to the eigenvalues $\xi'$ and
$\xi''$ respectively. Then we have the equations

$$\xi|\xi'\rangle = \xi'|\xi'\rangle, \tag{14}$$

$$\xi|\xi''\rangle = \xi''|\xi''\rangle. \tag{15}$$

Taking the conjugate imaginary of (14), we get

$$\langle\xi'|\xi = \xi'\langle\xi'|.$$

Multiplying this by $|\xi''\rangle$ on the right gives

$$\langle\xi'|\xi|\xi''\rangle = \xi'\langle\xi'|\xi''\rangle$$

and multiplying (15) by $\langle\xi'|$ on the left gives

$$\langle\xi'|\xi|\xi''\rangle = \xi''\langle\xi'|\xi''\rangle.$$

Hence, subtracting,

$$(\xi'-\xi'')\langle\xi'|\xi''\rangle = 0, \tag{16}$$

showing that, if $\xi' \neq \xi''$, $\langle\xi'|\xi''\rangle = 0$ and the two
eigenvectors $|\xi'\rangle$ and $|\xi''\rangle$ are orthogonal. This theorem will
be referred to as the orthogonality theorem.
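
The orthogonality theorem, together with result (i) that the eigenvalues are real, can be checked on a sample operator. The sketch assumes a 2 x 2 matrix representation with kets as lists of components; the matrix `xi`, its eigenkets and the helper names are hypothetical values chosen so that the arithmetic is exact.

```python
def apply(op, ket):
    """Result of a linear operator (matrix) operating on a ket (list)."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

def inner(a, b):
    """The number <A|B>."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

# A real (self-adjoint) operator with the two eigenvalues 3 and 1.
xi = [[2, 1j], [-1j, 2]]

ket_3 = [1j, 1]      # eigenket belonging to the eigenvalue 3
ket_1 = [-1j, 1]     # eigenket belonging to the eigenvalue 1

print(apply(xi, ket_3), [3 * c for c in ket_3])   # xi|3> compared with 3|3>
print(apply(xi, ket_1), [1 * c for c in ket_1])   # xi|1> compared with 1|1>
print(inner(ket_3, ket_1))                        # scalar product <3|1>
```

The scalar product of the two eigenkets vanishes, as equation (16) demands when the eigenvalues differ.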

We have been discussing properties of the eigenvalues and eigen- 
vectors of a real linear operator, but have not yet considered the 
question of whether, for a given real linear operator, any eigenvalues 
and eigenvectors exist, and if so, how to find them. This question 
is in general very difficult to answer. There is one useful special case, 
however, which is quite tractable, namely when the real linear
operator, $\xi$ say, satisfies an algebraic equation

$$\phi(\xi) \equiv \xi^n + a_1\xi^{n-1} + a_2\xi^{n-2} + \ldots + a_n = 0, \tag{17}$$

the coefficients $a$ being numbers. This equation means, of course,
that the linear operator $\phi(\xi)$ produces the result zero when applied
to any ket vector or to any bra vector.

Let (17) be the simplest algebraic equation that $\xi$ satisfies. Then
it will be shown that

($\alpha$) The number of eigenvalues of $\xi$ is $n$.

($\beta$) There are so many eigenkets of $\xi$ that any ket whatever can
be expressed as a sum of such eigenkets.

The algebraic form $\phi(\xi)$ can be factorized into $n$ linear factors, the
result being

$$\phi(\xi) \equiv (\xi-c_1)(\xi-c_2)(\xi-c_3)\ldots(\xi-c_n) \tag{18}$$

say, the $c$'s being numbers, not assumed to be all different. This
factorization can be performed with $\xi$ a linear operator just as well
as with $\xi$ an ordinary algebraic variable, since there is nothing
occurring in (18) that does not commute with $\xi$. Let the quotient
when $\phi(\xi)$ is divided by $(\xi-c_r)$ be $\chi_r(\xi)$, so that

$$\phi(\xi) \equiv (\xi-c_r)\chi_r(\xi) \qquad (r = 1, 2, 3, \ldots, n).$$

Then, for any ket $|P\rangle$,

$$(\xi-c_r)\chi_r(\xi)|P\rangle = \phi(\xi)|P\rangle = 0. \tag{19}$$

Now $\chi_r(\xi)|P\rangle$ cannot vanish for every ket $|P\rangle$, as otherwise
$\chi_r(\xi)$ itself would vanish and we should have $\xi$ satisfying an
algebraic equation of degree $n-1$, which would contradict the assumption
that (17) is the simplest equation that $\xi$ satisfies. If we choose
$|P\rangle$ so that $\chi_r(\xi)|P\rangle$ does not vanish, then equation (19)
shows that $\chi_r(\xi)|P\rangle$ is an eigenket of $\xi$, belonging to the
eigenvalue $c_r$. The argument holds for each value of $r$ from 1 to $n$, and
hence each of the $c$'s is an eigenvalue of $\xi$. No other number can be an
eigenvalue of $\xi$, since if $\xi'$ is any eigenvalue, belonging to an
eigenket $|\xi'\rangle$,

$$\xi|\xi'\rangle = \xi'|\xi'\rangle$$

and we can deduce

$$\phi(\xi)|\xi'\rangle = \phi(\xi')|\xi'\rangle,$$

and since the left-hand side vanishes we must have $\phi(\xi') = 0$.

To complete the proof of ($\alpha$) we must verify that the $c$'s are all
different. Suppose the $c$'s are not all different and $c_s$ occurs $m$ times
say, with $m > 1$. Then $\phi(\xi)$ is of the form

$$\phi(\xi) \equiv (\xi-c_s)^m\,\theta(\xi),$$

with $\theta(\xi)$ a rational integral function of $\xi$. Equation (17) now
gives us

$$(\xi-c_s)^m\,\theta(\xi)|A\rangle = 0 \tag{20}$$

for any ket $|A\rangle$. Since $c_s$ is an eigenvalue of $\xi$ it must be real,
so that $\xi-c_s$ is a real linear operator. Equation (20) is now of the same
form as equation (8) with $\xi-c_s$ for $\xi$ and $\theta(\xi)|A\rangle$ for
$|P\rangle$. From the theorem connected with equation (8) we can infer that

$$(\xi-c_s)\,\theta(\xi)|A\rangle = 0.$$

Since the ket $|A\rangle$ is arbitrary,

$$(\xi-c_s)\,\theta(\xi) = 0,$$

which contradicts the assumption that (17) is the simplest equation
that $\xi$ satisfies. Hence the $c$'s are all different and ($\alpha$) is proved.

Let $\chi_r(c_r)$ be the number obtained when $c_r$ is substituted for $\xi$ in
the algebraic expression $\chi_r(\xi)$. Since the $c$'s are all different,
$\chi_r(c_r)$ cannot vanish. Consider now the expression

$$\sum_r \frac{\chi_r(\xi)}{\chi_r(c_r)} - 1. \tag{21}$$

If $c_s$ is substituted for $\xi$ here, every term in the sum vanishes except
the one for which $r = s$, since $\chi_r(\xi)$ contains $(\xi-c_s)$ as a factor
when $r \neq s$, and the term for which $r = s$ is unity, so the whole
expression vanishes. Thus the expression (21) vanishes when $\xi$ is put equal
to any of the $n$ numbers $c_1, c_2, \ldots, c_n$. Since, however, the
expression is only of degree $n-1$ in $\xi$, it must vanish identically. If we
now apply the linear operator (21) to an arbitrary ket $|P\rangle$ and equate
the result to zero, we get

$$|P\rangle = \sum_r \frac{1}{\chi_r(c_r)}\,\chi_r(\xi)|P\rangle. \tag{22}$$

Each term in the sum on the right here is, according to (19), an
eigenket of $\xi$, if it does not vanish. Equation (22) thus expresses the
arbitrary ket $|P\rangle$ as a sum of eigenkets of $\xi$, and thus ($\beta$) is
proved.

As a simple example we may consider a real linear operator $\sigma$ that
satisfies the equation

$$\sigma^2 = 1. \tag{23}$$

Then $\sigma$ has the two eigenvalues 1 and $-1$. Any ket $|P\rangle$ can be
expressed as

$$|P\rangle = \tfrac{1}{2}(1+\sigma)|P\rangle + \tfrac{1}{2}(1-\sigma)|P\rangle.$$

It is easily verified that the two terms on the right here are eigenkets
of $\sigma$, belonging to the eigenvalues 1 and $-1$ respectively, when they
do not vanish.
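
The decomposition in this example can be carried out explicitly for one concrete choice of $\sigma$. The sketch below assumes a 2 x 2 matrix that satisfies (23), with kets as lists of two components; the particular matrix, the sample ket and the helper names are hypothetical.

```python
def apply(op, ket):
    """Result of a linear operator (matrix) operating on a ket (list)."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

sigma = [[0, 1], [1, 0]]          # a real operator with sigma^2 = 1
ket_p = [3 + 0j, 1j]              # an arbitrary ket

sigma_p = apply(sigma, ket_p)
plus = [0.5 * (p + s) for p, s in zip(ket_p, sigma_p)]    # (1/2)(1 + sigma)|P>
minus = [0.5 * (p - s) for p, s in zip(ket_p, sigma_p)]   # (1/2)(1 - sigma)|P>

print([a + b for a, b in zip(plus, minus)])   # recovers |P>
print(apply(sigma, plus), plus)               # eigenvalue 1
print(apply(sigma, minus), minus)             # eigenvalue -1
```

Applying `sigma` leaves the first term unchanged and reverses the sign of the second, so the two terms are eigenkets belonging to 1 and to -1 respectively.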


10. Observables

We have made a number of assumptions about the way in which
states and dynamical variables are to be represented mathematically
in the theory. These assumptions are not, by themselves, laws of
nature, but become laws of nature when we make some further
assumptions that provide a physical interpretation of the theory.
Such further assumptions must take the form of establishing
connexions between the results of observations, on one hand, and the
equations of the mathematical formalism on the other.

When we make an observation we measure some dynamical variable.
It is obvious physically that the result of such a measurement must
always be a real number, so we should expect that any dynamical
variable that we can measure must be a real dynamical variable.
One might think one could measure a complex dynamical variable
by measuring separately its real and pure imaginary parts. But this
would involve two measurements or two observations, which would
be all right in classical mechanics, but would not do in quantum
mechanics, where two observations in general interfere with one
another; it is not in general permissible to consider that two
observations can be made exactly simultaneously, and if they are made in
quick succession the first will usually disturb the state of the system
and introduce an indeterminacy that will affect the second. We
therefore have to restrict the dynamical variables that we can
measure to be real, the condition for this in quantum mechanics
being as given in § 8. Not every real dynamical variable can be
measured, however. A further restriction is needed, as we shall see
later.

We now make some assumptions for the physical interpretation of
the theory. If the dynamical system is in an eigenstate of a real
dynamical variable $\xi$, belonging to the eigenvalue $\xi'$, then a
measurement of $\xi$ will certainly give as result the number $\xi'$.
Conversely, if the system is in a state such that a measurement of a real
dynamical variable $\xi$ is certain to give one particular result (instead of
giving one or other of several possible results according to a probability
law, as is in general the case), then the state is an eigenstate of $\xi$ and
the result of the measurement is the eigenvalue of $\xi$ to which this
eigenstate belongs. These assumptions are reasonable on account of the
eigenvalues of real linear operators being always real numbers.

Some of the immediate consequences of the assumptions will be
noted. If we have two or more eigenstates of a real dynamical
variable $\xi$ belonging to the same eigenvalue $\xi'$, then any state
formed by superposition of them will also be an eigenstate of $\xi$
belonging to the eigenvalue $\xi'$. We can infer that if we have two or
more states for which a measurement of $\xi$ is certain to give the result
$\xi'$, then for any state formed by superposition of them a measurement
of $\xi$ will still be certain to give the result $\xi'$. This gives us some
insight into the physical significance of superposition of states. Again, two
eigenstates of $\xi$ belonging to different eigenvalues are orthogonal.
We can infer that two states for which a measurement of $\xi$ is certain
to give two different results are orthogonal. This gives us some
insight into the physical significance of orthogonal states.
When we measure a real dynamical variable $\xi$, the disturbance
involved in the act of measurement causes a jump in the state of the
dynamical system. From physical continuity, if we make a second
measurement of the same dynamical variable $\xi$ immediately after
the first, the result of the second measurement must be the same as
that of the first. Thus after the first measurement has been made,
there is no indeterminacy in the result of the second. Hence, after
the first measurement has been made, the system is in an eigenstate
of the dynamical variable $\xi$, the eigenvalue it belongs to being equal
to the result of the first measurement. This conclusion must still hold
if the second measurement is not actually made. In this way we see
that a measurement always causes the system to jump into an
eigenstate of the dynamical variable that is being measured, the eigenvalue
this eigenstate belongs to being equal to the result of the measurement.

We can infer that, with the dynamical system in any state, any
result of a measurement of a real dynamical variable is one of its
eigenvalues. Conversely, every eigenvalue is a possible result of a
measurement of the dynamical variable for some state of the system, since it
is certainly the result if the state is an eigenstate belonging to this
eigenvalue. This gives us the physical significance of eigenvalues.
The set of eigenvalues of a real dynamical variable are just the
possible results of measurements of that dynamical variable and the
calculation of eigenvalues is for this reason an important problem.

Another assumption we make connected with the physical
interpretation of the theory is that, if a certain real dynamical variable
$\xi$ is measured with the system in a particular state, the states into which
the system may jump on account of the measurement are such that the
original state is dependent on them. Now these states into which
the system may jump are all eigenstates of $\xi$, and hence the original
state is dependent on eigenstates of $\xi$. But the original state may be
any state, so we can conclude that any state is dependent on
eigenstates of $\xi$. If we define a complete set of states to be a set such
that any state is dependent on them, then our conclusion can be
formulated: the eigenstates of $\xi$ form a complete set.

Not every real dynamical variable has sufficient eigenstates to form
a complete set. Those whose eigenstates do not form complete sets
are not quantities that can be measured. We obtain in this way a
further condition that a dynamical variable has to satisfy in order
that it shall be susceptible to measurement, in addition to the
condition that it shall be real. We call a real dynamical variable whose
eigenstates form a complete set an observable. Thus any quantity
that can be measured is an observable.

The question now presents itself: Can every observable be
measured? The answer theoretically is yes. In practice it may be 
very awkward, or perhaps even beyond the ingenuity of the experi- 
menter, to devise an apparatus which could measure some particular 
observable, but the theory always allows one to imagine that the 
measurement can be made. 

Let us examine mathematically the condition for a real dynamical
variable $\xi$ to be an observable. Its eigenvalues may consist of a
(finite or infinite) discrete set of numbers, or alternatively, they
may consist of all numbers in a certain range, such as all numbers
lying between $a$ and $b$. In the former case, the condition that
any state is dependent on eigenstates of $\xi$ is that any ket can
be expressed as a sum of eigenkets of $\xi$. In the latter case the
condition needs modification, since one may have an integral instead
of a sum, i.e. a ket $|P\rangle$ may be expressible as an integral of
eigenkets of $\xi$,

$$|P\rangle = \int |\xi'\rangle\, d\xi', \tag{24}$$

$|\xi'\rangle$ being an eigenket of $\xi$ belonging to the eigenvalue $\xi'$ and
the range of integration being the range of eigenvalues, as such a ket is
dependent on eigenkets of $\xi$. Not every ket dependent on eigenkets
of $\xi$ can be expressed in the form of the right-hand side of (24), since
one of the eigenkets itself cannot, and more generally any sum of
eigenkets cannot. The condition for the eigenstates of $\xi$ to form a
complete set must thus be formulated, that any ket $|P\rangle$ can be
expressed as an integral plus a sum of eigenkets of $\xi$, i.e.

$$|P\rangle = \int |\xi'c\rangle\, d\xi' + \sum_r |\xi^r d\rangle, \tag{25}$$

where the $|\xi'c\rangle$, $|\xi^r d\rangle$ are all eigenkets of $\xi$, the labels
$c$ and $d$ being inserted to distinguish them when the eigenvalues $\xi'$ and
$\xi^r$ are equal, and where the integral is taken over the whole range of
eigenvalues and the sum is taken over any selection of them. If this condition
is satisfied in the case when the eigenvalues of $\xi$ consist of a range
of numbers, then $\xi$ is an observable.

There is a more general case that sometimes occurs, namely the
eigenvalues of ξ may consist of a range of numbers together with a
discrete set of numbers lying outside the range. In this case the
condition that ξ shall be an observable is still that any ket shall be
expressible in the form of the right-hand side of (25), but the sum
over r is now a sum over the discrete set of eigenvalues as well as a
selection of those in the range.

It is often very difficult to decide mathematically whether a
particular real dynamical variable satisfies the condition for being
an observable or not, because the whole problem of finding eigenvalues
and eigenvectors is in general very difficult. However, we may have
good reason on experimental grounds for believing that the dynamical
variable can be measured and then we may reasonably assume that it is
an observable even though the mathematical proof is missing. This is
a thing we shall frequently do during the course of development of the
theory, e.g. we shall assume the energy of any dynamical system to be
always an observable, even though it is beyond the power of
present-day mathematical analysis to prove it so except in simple
cases.

In the special case when the real dynamical variable is a number, 
every state is an eigenstate and the dynamical variable is obviously 
an observable. Any measurement of it always gives the same result, 
so it is just a physical constant, like the charge on an electron. 
A physical constant in quantum mechanics may thus be looked upon 
either as an observable with a single eigenvalue or as a mere number 
appearing in the equations, the two points of view being equivalent. 

If the real dynamical variable satisfies an algebraic equation, then
the result (β) of the preceding section shows that the dynamical
variable is an observable. Such an observable has a finite number of
eigenvalues. Conversely, any observable with a finite number of
eigenvalues satisfies an algebraic equation, since if the observable ξ
has as its eigenvalues ξ′, ξ″, ..., ξ^n, then

$$ (\xi - \xi')(\xi - \xi'') \cdots (\xi - \xi^n)\,|P\rangle = 0 $$

holds for |P⟩ any eigenket of ξ, and thus it holds for any |P⟩
whatever, because any ket can be expressed as a sum of eigenkets of ξ
on account of ξ being an observable. Hence

$$ (\xi - \xi')(\xi - \xi'') \cdots (\xi - \xi^n) = 0. \tag{26} $$

As an example we may consider the linear operator |A⟩⟨A|, where |A⟩
is a normalized ket. This linear operator is real according to (7),
and its square is

$$ \{|A\rangle\langle A|\}^2 = |A\rangle\langle A|A\rangle\langle A| = |A\rangle\langle A| \tag{27} $$

since ⟨A|A⟩ = 1. Thus its square equals itself and so it satisfies an
algebraic equation and is an observable. Its eigenvalues are 1 and 0,
with |A⟩ as the eigenket belonging to the eigenvalue 1 and all kets
orthogonal to |A⟩ as eigenkets belonging to the eigenvalue 0. A
measurement of the observable thus certainly gives the result 1 if
the dynamical system is in the state corresponding to |A⟩ and the
result 0 if the system is in any orthogonal state, so the observable
may be described as the quantity which determines whether the system
is in the state |A⟩ or not.
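
The properties of |A⟩⟨A| stated above can be checked numerically. The
following Python sketch is an added illustration, not part of the
original argument. It assumes a two-dimensional ket space; the
components chosen for `ket_a` and `ket_b` and all helper names are
hypothetical.

```python
def outer(ket):
    """Matrix of the operator |A><A| for a ket given as complex components."""
    return [[a * b.conjugate() for b in ket] for a in ket]


def matmul(m, n):
    """Product of two square matrices stored as nested lists."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def apply(m, ket):
    """Result of the linear operator m acting on a ket."""
    return [sum(entry * c for entry, c in zip(row, ket)) for row in m]


def close(u, v, tol=1e-12):
    """True if two lists of numbers agree component by component within tol."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


# Assumed example: a normalized ket |A> and a ket |B> orthogonal to it.
ket_a = [complex(0.6, 0.0), complex(0.0, 0.8)]
ket_b = [complex(0.0, 0.8), complex(0.6, 0.0)]

projector = outer(ket_a)
squared = matmul(projector, projector)

# Equation (27): the square of |A><A| equals |A><A| itself.
square_equals_itself = all(close(r, s) for r, s in zip(squared, projector))
# |A> belongs to the eigenvalue 1, the orthogonal ket |B> to the eigenvalue 0.
a_has_eigenvalue_one = close(apply(projector, ket_a), ket_a)
b_has_eigenvalue_zero = close(apply(projector, ket_b), [0j, 0j])
print(square_equals_itself, a_has_eigenvalue_one, b_has_eigenvalue_zero)
```

Any other normalized `ket_a` would serve equally well, since the check
uses only ⟨A|A⟩ = 1 and ⟨A|B⟩ = 0.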

Before concluding this section we should examine the conditions for
an integral such as occurs in (24) to be significant. Suppose |X⟩ and
|Y⟩ are two kets which can be expressed as integrals of eigenkets of
the observable ξ,

$$ |X\rangle = \int |\xi' x\rangle \, d\xi', \qquad |Y\rangle = \int |\xi'' y\rangle \, d\xi'', $$

x and y being used as labels to distinguish the two integrands. Then
we have, taking the conjugate imaginary of the first equation and
multiplying by the second,

$$ \langle X|Y\rangle = \iint \langle \xi' x|\xi'' y\rangle \, d\xi' \, d\xi''. \tag{28} $$

Consider now the single integral

$$ \int \langle \xi' x|\xi'' y\rangle \, d\xi''. \tag{29} $$

From the orthogonality theorem, the integrand here must vanish over
the whole range of integration except the one point ξ″ = ξ′. If the
integrand is finite at this point, the integral (29) vanishes, and if
this holds for all ξ′, we get from (28) that ⟨X|Y⟩ vanishes. Now in
general ⟨X|Y⟩ does not vanish, so in general ⟨ξ′x|ξ′y⟩ must be
infinitely great in such a way as to make (29) non-vanishing and
finite. The form of infinity required for this will be discussed in
§ 15.

In our work up to the present it has been implied that our bra and 
ket vectors are of finite length and their scalar products are finite. 
We see now the need for relaxing this condition when we are dealing 
with eigenvectors of an observable whose eigenvalues form a range. 
If we did not relax it, the phenomenon of ranges of eigenvalues could 
not occur and our theory would be too weak for most practical 
problems.

Taking |Y⟩ = |X⟩ above, we get the result that in general ⟨ξ′x|ξ′x⟩
is infinitely great. We shall assume that if |ξ′x⟩ ≠ 0

$$ \int \langle \xi' x|\xi'' x\rangle \, d\xi'' > 0, \tag{30} $$

as the axiom corresponding to (8) of § 6 for vectors of infinite
length.

The space of bra or ket vectors when the vectors are restricted to
be of finite length and to have finite scalar products is called by
mathematicians a Hilbert space. The bra and ket vectors that we
now use form a more general space than a Hilbert space.

We can now see that the expansion of a ket |P⟩ in the form of the
right-hand side of (25) is unique, provided there are not two or more
terms in the sum referring to the same eigenvalue. To prove this
result, let us suppose that two different expansions of |P⟩ are
possible. Then by subtracting one from the other, we get an equation
of the form

$$ 0 = \int |\xi' a\rangle \, d\xi' + \sum_s |\xi^s b\rangle, \tag{31} $$

a and b being used as new labels for the eigenvectors, and the sum
over s including all terms left after the subtraction of one sum from
the other. If there is a term in the sum in (31) referring to an
eigenvalue ξ^t not in the range, we get, by multiplying (31) on the
left by ⟨ξ^t b| and using the orthogonality theorem,

$$ 0 = \langle \xi^t b|\xi^t b\rangle, $$

which contradicts (8) of § 6. Again, if the integrand in (31) does not
vanish for some eigenvalue ξ″ not equal to any ξ^s occurring in the
sum, we get, by multiplying (31) on the left by ⟨ξ″a| and using the
orthogonality theorem,

$$ 0 = \int \langle \xi'' a|\xi' a\rangle \, d\xi', $$

which contradicts (30). Finally, if there is a term in the sum in (31)
referring to an eigenvalue ξ^t in the range, we get, multiplying (31)
on the left by ⟨ξ^t b|,

$$ 0 = \int \langle \xi^t b|\xi' a\rangle \, d\xi' + \langle \xi^t b|\xi^t b\rangle \tag{32} $$

and multiplying (31) on the left by ⟨ξ^t a|,

$$ 0 = \int \langle \xi^t a|\xi' a\rangle \, d\xi' + \langle \xi^t a|\xi^t b\rangle. \tag{33} $$

Now the integral in (33) is finite, so ⟨ξ^t a|ξ^t b⟩ is finite and
⟨ξ^t b|ξ^t a⟩ is finite. The integral in (32) must then be zero, so
⟨ξ^t b|ξ^t b⟩ is zero and we again have a contradiction. Thus every
term in (31) must vanish and the expansion of a ket |P⟩ in the form
of the right-hand side of (25) must be unique.

11. Functions of observables 

Let ξ be an observable. We can multiply it by any real number k and
get another observable kξ. In order that our theory may be
self-consistent it is necessary that, when the system is in a state
such that a measurement of the observable ξ certainly gives the
result ξ′, a measurement of the observable kξ shall certainly give the
result kξ′. It is easily verified that this condition is fulfilled.
The ket corresponding to a state for which a measurement of ξ
certainly gives the result ξ′ is an eigenket of ξ, |ξ′⟩ say,
satisfying

$$ \xi|\xi'\rangle = \xi'|\xi'\rangle. $$

This equation leads to

$$ k\xi|\xi'\rangle = k\xi'|\xi'\rangle, $$

showing that |ξ′⟩ is an eigenket of kξ belonging to the eigenvalue
kξ′, and thus that a measurement of kξ will certainly give the result
kξ′.

More generally, we may take any real function of ξ, f(ξ) say, and
consider it as a new observable which is automatically measured
whenever ξ is measured, since an experimental determination of the
value of ξ also provides the value of f(ξ). We need not restrict f(ξ)
to be real, and then its real and pure imaginary parts are two
observables which are automatically measured when ξ is measured. For
the theory to be consistent it is necessary that, when the system is
in a state such that a measurement of ξ certainly gives the result ξ′,
a measurement of the real and pure imaginary parts of f(ξ) shall
certainly give for results the real and pure imaginary parts of f(ξ′).
In the case when f(ξ) is expressible as a power series

$$ f(\xi) = c_0 + c_1\xi + c_2\xi^2 + c_3\xi^3 + \cdots, $$

the c's being numbers, this condition can again be verified by
elementary algebra. In the case of more general functions f it may not
be possible to verify the condition. The condition may then be used to
define f(ξ), which we have not yet defined mathematically. In this way
we can get a more general definition of a function of an observable
than is provided by power series.

We define f(ξ) in general to be that linear operator which satisfies

$$ f(\xi)|\xi'\rangle = f(\xi')|\xi'\rangle \tag{34} $$

for every eigenket |ξ′⟩ of ξ, f(ξ′) being a number for each eigenvalue
ξ′. It is easily seen that this definition is self-consistent when
applied to eigenkets |ξ′⟩ that are not independent. If we have an
eigenket |ξ′A⟩ dependent on other eigenkets of ξ, these other
eigenkets must all belong to the same eigenvalue ξ′, otherwise we
should have an equation of the type (31), which we have seen is
impossible. On multiplying the equation which expresses |ξ′A⟩ linearly
in terms of the other eigenkets of ξ by f(ξ) on the left, we merely
multiply each term in it by the number f(ξ′), so we obviously get a
consistent equation. Further, equation (34) is sufficient to define
the linear operator f(ξ) completely, since to get the result of f(ξ)
multiplied into an arbitrary ket |P⟩, we have only to expand |P⟩ in
the form of the right-hand side of (25) and take

$$ f(\xi)|P\rangle = \int f(\xi')|\xi' c\rangle \, d\xi' + \sum_r f(\xi^r)|\xi^r d\rangle. \tag{35} $$

The conjugate complex $\overline{f(\xi)}$ of f(ξ) is defined by the
conjugate imaginary equation to (34), namely

$$ \langle\xi'|\,\overline{f(\xi)} = \bar f(\xi')\langle\xi'|, $$

holding for any eigenbra ⟨ξ′|, $\bar f(\xi')$ being the conjugate
complex function to f(ξ′). Let us replace ξ′ here by ξ″ and multiply
the equation on the right by the arbitrary ket |P⟩. Then we get, using
the expansion (25) for |P⟩,

$$ \begin{aligned}
\langle\xi''|\,\overline{f(\xi)}\,|P\rangle &= \bar f(\xi'')\langle\xi''|P\rangle \\
&= \int \bar f(\xi'')\langle\xi''|\xi' c\rangle \, d\xi' + \sum_r \bar f(\xi'')\langle\xi''|\xi^r d\rangle \\
&= \int \bar f(\xi'')\langle\xi''|\xi' c\rangle \, d\xi' + \bar f(\xi'')\langle\xi''|\xi'' d\rangle
\end{aligned} \tag{36} $$

with the help of the orthogonality theorem, ⟨ξ″|ξ″d⟩ being understood
to be zero if ξ″ is not one of the eigenvalues to which the terms in
the sum in (25) refer. Again, putting the conjugate complex function
$\bar f(\xi')$ for f(ξ′) in (35) and multiplying on the left by ⟨ξ″|,
we get

$$ \langle\xi''|\,\bar f(\xi)\,|P\rangle = \int \bar f(\xi')\langle\xi''|\xi' c\rangle \, d\xi' + \bar f(\xi'')\langle\xi''|\xi'' d\rangle. $$

The right-hand side here equals that of (36), since the integrands
vanish for ξ′ ≠ ξ″, and hence

$$ \langle\xi''|\,\overline{f(\xi)}\,|P\rangle = \langle\xi''|\,\bar f(\xi)\,|P\rangle. $$

This holds for ⟨ξ″| any eigenbra and |P⟩ any ket, so

$$ \overline{f(\xi)} = \bar f(\xi). \tag{37} $$

Thus the conjugate complex of the linear operator f(ξ) is the
conjugate complex function $\bar f$ of ξ.

It follows as a corollary that if f(ξ′) is a real function of ξ′,
f(ξ) is a real linear operator. f(ξ) is then also an observable, since
its eigenstates form a complete set, every eigenstate of ξ being also
an eigenstate of f(ξ).

With the above definition we are able to give a meaning to any
function f of an observable, provided only that the domain of
existence of the function of a real variable f(x) includes all the
eigenvalues of the observable. If the domain of existence contains
other points besides these eigenvalues, then the values of f(x) for
these other points will not affect the function of the observable.
The function need not be analytic or continuous. The eigenvalues of a
function f of an observable are just the function f of the eigenvalues
of the observable.
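
For an observable with discrete eigenvalues only, definition (34)
together with (35) amounts to multiplying each eigenket in the
expansion by the number f(ξ′). The Python sketch below is an added
illustration under stated assumptions, not a construction from the
text. It takes a two-dimensional ket space and an observable with
eigenvalues +1 and -1 whose real, orthonormal eigenkets are assumed
known. `function_of_observable`, `step` and the other names are
hypothetical.

```python
import math


def function_of_observable(eigenpairs, f):
    """Matrix of f(xi), built as the sum of f(xi') |xi'><xi'| over eigenkets.

    Assumes real, orthonormal eigenkets that form a complete set.
    """
    size = len(eigenpairs[0][1])
    result = [[0.0] * size for _ in range(size)]
    for eigenvalue, ket in eigenpairs:
        weight = f(eigenvalue)
        for i in range(size):
            for j in range(size):
                result[i][j] += weight * ket[i] * ket[j]
    return result


def apply(m, ket):
    """Result of the linear operator m acting on a ket."""
    return [sum(entry * c for entry, c in zip(row, ket)) for row in m]


def close(u, v, tol=1e-12):
    """True if two lists of numbers agree component by component within tol."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


def step(x):
    """A discontinuous function: the definition does not need analyticity."""
    return 1.0 if x > 0 else 0.0


s = 1.0 / math.sqrt(2.0)
# Assumed observable: the matrix [[0, 1], [1, 0]], eigenvalues +1 and -1.
eigenpairs = [(1.0, [s, s]), (-1.0, [s, -s])]

step_of_xi = function_of_observable(eigenpairs, step)
xi_itself = function_of_observable(eigenpairs, lambda x: x)

# Equation (34): f(xi)|xi'> = f(xi')|xi'> for each eigenket.
satisfies_34 = all(
    close(apply(step_of_xi, ket), [step(value) * c for c in ket])
    for value, ket in eigenpairs
)
print(satisfies_34)
```

Taking f(x) = x recovers the assumed matrix of ξ itself, which is a
quick consistency check on the construction.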

It is important to observe that the possibility of defining a function
f of an observable requires the existence of a unique number f(x) for
each value of x which is an eigenvalue of the observable. Thus the
function f(x) must be single-valued. This may be illustrated by
considering the question: When we have an observable f(A) which is a
real function of the observable A, is the observable A a function of
the observable f(A)? The answer to this is yes, if different
eigenvalues A′ of A always lead to different values of f(A′). If,
however, there exist two different eigenvalues of A, A′ and A″ say,
such that f(A′) = f(A″), then, corresponding to the eigenvalue f(A′)
of the observable f(A), there will not be a unique eigenvalue of the
observable A and the latter will not be a function of the observable
f(A).
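
The single-valuedness criterion can be stated as a small check on the
list of eigenvalues. The following Python sketch is an added
illustration; the eigenvalue lists and the name `inverse_function` are
hypothetical, and the observable is assumed to have finitely many
eigenvalues.

```python
def inverse_function(eigenvalues, f):
    """Return the map f(A') -> A' if it is single-valued, otherwise None."""
    mapping = {}
    for value in eigenvalues:
        image = f(value)
        if image in mapping and mapping[image] != value:
            # Two different eigenvalues share one image, so A is not a
            # function of f(A).
            return None
        mapping[image] = value
    return mapping


def square(x):
    """The real function f(x) = x squared used for both examples."""
    return x * x


# Eigenvalues 0, 1, 2 have distinct squares: A is a function of f(A).
distinct_images = inverse_function([0, 1, 2], square)
# Eigenvalues -1 and 1 have the same square: A is not a function of f(A).
shared_image = inverse_function([-1, 0, 1], square)
print(distinct_images, shared_image)
```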

It may easily be verified mathematically, from the definition, that
the sum or product of two functions of an observable is a function of
that observable and that a function of a function of an observable is
a function of that observable. Also it is easily seen that the whole
theory of functions of an observable is symmetrical between bras and
kets and that we could equally well work from the equation

$$ \langle\xi'|f(\xi) = f(\xi')\langle\xi'| \tag{38} $$

instead of from (34).

We shall conclude this section with a discussion of two examples
which are of great practical importance, namely the reciprocal and
the square root. The reciprocal of an observable exists if the
observable does not have the eigenvalue zero. If the observable α does
not have the eigenvalue zero, the reciprocal observable, which we call
α⁻¹ or 1/α, will satisfy

$$ \alpha^{-1}|\alpha'\rangle = \alpha'^{-1}|\alpha'\rangle, \tag{39} $$

where |α′⟩ is an eigenket of α belonging to the eigenvalue α′. Hence

$$ \alpha\alpha^{-1}|\alpha'\rangle = \alpha\,\alpha'^{-1}|\alpha'\rangle = |\alpha'\rangle. $$

Since this holds for any eigenket |α′⟩, we must have

$$ \alpha\alpha^{-1} = 1. \tag{40} $$

Similarly,

$$ \alpha^{-1}\alpha = 1. \tag{41} $$

Either of these equations is sufficient to determine α⁻¹ completely,
provided α does not have the eigenvalue zero. To prove this in the
case of (40), let x be any linear operator satisfying the equation

$$ \alpha x = 1 $$

and multiply both sides on the left by the α⁻¹ defined by (39). The
result is

$$ \alpha^{-1}\alpha x = \alpha^{-1} $$

and hence from (41)

$$ x = \alpha^{-1}. $$


Equations (40) and (41) can be used to define the reciprocal, when
it exists, of a general linear operator α, which need not even be
real. One of these equations by itself is then not necessarily
sufficient. If any two linear operators α and β have reciprocals,
their product αβ has the reciprocal

$$ (\alpha\beta)^{-1} = \beta^{-1}\alpha^{-1}, \tag{42} $$

obtained by taking the reciprocal of each factor and reversing their
order. We verify (42) by noting that its right-hand side gives unity
when multiplied by αβ, either on the right or on the left. This
reciprocal law for products can be immediately extended to more than
two factors, i.e.,

$$ (\alpha\beta\gamma\cdots)^{-1} = \cdots\gamma^{-1}\beta^{-1}\alpha^{-1}. $$
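
The reversal of order in (42) matters whenever the two operators do
not commute. As an added illustration (a sketch with assumed 2×2
matrices standing in for general linear operators; `alpha`, `beta` and
the helper names are hypothetical), exact rational arithmetic shows
the reversed product to be the reciprocal while the unreversed one is
not:

```python
from fractions import Fraction


def matmul(m, n):
    """Product of two square matrices stored as nested lists."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def inverse_2x2(m):
    """Reciprocal of a 2x2 matrix; raises if the determinant is zero."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ZeroDivisionError("operator has no reciprocal")
    return [[d / det, -b / det], [-c / det, a / det]]


# Assumed example operators; they do not commute with each other.
alpha = [[2, 1], [1, 1]]
beta = [[1, 2], [0, 3]]

product = matmul(alpha, beta)
left = inverse_2x2(product)
reversed_order = matmul(inverse_2x2(beta), inverse_2x2(alpha))
same_order = matmul(inverse_2x2(alpha), inverse_2x2(beta))

# Equation (42): only the reversed order reproduces the reciprocal.
print(left == reversed_order, left == same_order)
```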

The square root of an observable α always exists, and is real if α
has no negative eigenvalues. We write it √α or α^{1/2}. It satisfies

$$ \sqrt{\alpha}\,|\alpha'\rangle = \pm\sqrt{\alpha'}\,|\alpha'\rangle, \tag{43} $$

|α′⟩ being an eigenket of α belonging to the eigenvalue α′. Hence

$$ \sqrt{\alpha}\sqrt{\alpha}\,|\alpha'\rangle = \sqrt{\alpha'}\sqrt{\alpha'}\,|\alpha'\rangle = \alpha'|\alpha'\rangle = \alpha|\alpha'\rangle, $$

and since this holds for any eigenket |α′⟩ we must have

$$ \sqrt{\alpha}\sqrt{\alpha} = \alpha. \tag{44} $$

On account of the ambiguity of sign in (43) there will be several
square roots. To fix one of them we must specify a particular sign
in (43) for each eigenvalue. This sign may vary irregularly from one
eigenvalue to the next and equation (43) will always define a linear
operator √α satisfying (44) and forming a square-root function of α.
If there is an eigenvalue of α with two or more independent eigenkets
belonging to it, then we must, according to our definition of a
function, have the same sign in (43) for each of these eigenkets. If
we took different signs, however, equation (44) would still hold, and
hence equation (44) by itself is not sufficient to define √α, except
in the special case when there is only one independent eigenket of α
belonging to any eigenvalue.

The number of different square roots of an observable is 2^n, where
n is the total number of eigenvalues not zero. In practice the
square-root function is used only for observables without negative
eigenvalues and the particular square root that is useful is the one
for which the positive sign is always taken in (43). This one will be
called the positive square root.
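
The count 2^n can be confirmed by enumerating the sign choices in
(43). The Python sketch below is an added illustration. It assumes an
observable with finitely many distinct, non-negative eigenvalues and
represents each square root simply by the list of its own eigenvalues;
`square_roots` and the sample eigenvalues are hypothetical.

```python
import itertools
import math


def square_roots(eigenvalues):
    """All square roots of an observable, given its distinct eigenvalues.

    Each root is returned as the tuple of its own eigenvalues, in the same
    order as the input; one sign is chosen per eigenvalue, as (43) requires.
    """
    choices = []
    for value in eigenvalues:
        root = math.sqrt(value)  # assumes no negative eigenvalues
        # The eigenvalue zero has only one square root, so it adds no choice.
        choices.append([root] if value == 0 else [root, -root])
    return list(itertools.product(*choices))


eigenvalues = [4.0, 9.0, 0.0]
roots = square_roots(eigenvalues)
nonzero = sum(1 for value in eigenvalues if value != 0)
positive_root = tuple(math.sqrt(value) for value in eigenvalues)

# Every choice of signs squares back to the original eigenvalues, as in (44).
all_square_back = all(
    abs(r * r - value) < 1e-12
    for root in roots
    for r, value in zip(root, eigenvalues)
)
print(len(roots), 2 ** nonzero, all_square_back)
```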

12. The general physical interpretation 

The assumptions that we made at the beginning of § 10 to get a 
physical interpretation of the mathematical theory are of a rather 
special kind, since they can be used only in connexion with eigen- 
states. We need some more general assumption which will enable us 
to extract physical information from the mathematics even when we 
are not dealing with eigenstates. 

In classical mechanics an observable always, as we say, ‘has a
value’ for any particular state of the system. What is there in
quantum mechanics corresponding to this? If we take any observable ξ
and any two states x and y, corresponding to the vectors ⟨x| and |y⟩,
then we can form the number ⟨x|ξ|y⟩. This number is not very closely
analogous to the value which an observable can ‘have’ in the
classical theory, for three reasons, namely, (i) it refers to two
states of the system, while the classical value always refers to one,
(ii) it is in general not a real number, and (iii) it is not uniquely
determined by the observable and the states, since the vectors ⟨x| and
|y⟩ contain arbitrary numerical factors. Even if we impose on ⟨x| and
|y⟩ the condition that they shall be normalized, there will still be
an undetermined factor of modulus unity in ⟨x|ξ|y⟩. These three
reasons cease to apply, however, if we take the two states to be
identical and |y⟩ to be the conjugate imaginary vector to ⟨x|. The
number that we then get, namely ⟨x|ξ|x⟩, is necessarily real, and also
it is uniquely determined when ⟨x| is normalized, since if we multiply
⟨x| by the numerical factor e^{ic}, c being some real number, we must
multiply |x⟩ by e^{-ic} and ⟨x|ξ|x⟩ will be unaltered.

One might thus be inclined to make the tentative assumption that
the observable ξ ‘has the value’ ⟨x|ξ|x⟩ for the state x, in a sense
analogous to the classical sense. This would not be satisfactory,
though, for the following reason. Let us take a second observable η,
which would have by the above assumption the value ⟨x|η|x⟩ for this
same state. We should then expect, from classical analogy, that for
this state the sum of the two observables would have a value equal to
the sum of the values of the two observables separately and the
product of the two observables would have a value equal to the product
of the values of the two observables separately. Actually, the
tentative assumption would give for the sum of the two observables
the value ⟨x|ξ + η|x⟩, which is, in fact, equal to the sum of ⟨x|ξ|x⟩
and ⟨x|η|x⟩, but for the product it would give the value ⟨x|ξη|x⟩ or
⟨x|ηξ|x⟩, neither of which is connected in any simple way with
⟨x|ξ|x⟩ and ⟨x|η|x⟩.

However, since things go wrong only with the product and not with
the sum, it would be reasonable to call ⟨x|ξ|x⟩ the average value of
the observable ξ for the state x. This is because the average of the
sum of two quantities must equal the sum of their averages, but the
average of their product need not equal the product of their averages.
We therefore make the general assumption that if the measurement of
the observable ξ for the system in the state corresponding to |x⟩ is
made a large number of times, the average of all the results obtained
will be ⟨x|ξ|x⟩, provided |x⟩ is normalized. If |x⟩ is not normalized,
as is necessarily the case if the state x is an eigenstate of some
observable belonging to an eigenvalue in a range, the assumption
becomes that the average result of a measurement of ξ is proportional
to ⟨x|ξ|x⟩. This general assumption provides a basis for a general
physical interpretation of the theory.
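
The behaviour of ⟨x|ξ|x⟩ under sums, products and phase factors can be
seen in a small numerical case. The Python sketch below is an added
illustration, not taken from the text. It assumes a two-dimensional
ket space, and the matrices `xi` and `eta`, the ket `ket_x` and the
helper names are hypothetical choices.

```python
import cmath


def matmul(m, n):
    """Product of two square matrices stored as nested lists."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matadd(m, n):
    """Sum of two square matrices stored as nested lists."""
    return [[a + b for a, b in zip(row_m, row_n)] for row_m, row_n in zip(m, n)]


def expectation(op, ket):
    """The number <x|op|x>; the bra is the conjugate imaginary of the ket."""
    size = len(ket)
    applied = [sum(op[i][j] * ket[j] for j in range(size)) for i in range(size)]
    return sum(ket[i].conjugate() * applied[i] for i in range(size))


# Assumed example: two real, non-commuting observables and a normalized ket.
xi = [[0, 1], [1, 0]]
eta = [[1, 0], [0, -1]]
ket_x = [0.6, 0.8]

avg_xi = expectation(xi, ket_x)
avg_eta = expectation(eta, ket_x)
avg_sum = expectation(matadd(xi, eta), ket_x)
avg_product = expectation(matmul(xi, eta), ket_x)

# Multiplying the ket by a factor of modulus unity multiplies the bra by
# the conjugate factor, so the average is unaltered.
phase = cmath.exp(0.7j)
avg_xi_rephased = expectation(xi, [phase * c for c in ket_x])

print(abs(avg_sum - (avg_xi + avg_eta)) < 1e-12)      # averages add
print(abs(avg_product - avg_xi * avg_eta) < 1e-12)    # they do not multiply
print(abs(avg_xi_rephased - avg_xi) < 1e-12)          # phase drops out
```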

The expression that an observable ‘has a particular value’ for a
particular state is permissible in quantum mechanics in the special
case when a measurement of the observable is certain to lead to the
particular value, so that the state is an eigenstate of the
observable. It may easily be verified from the algebra that, with this
restricted meaning for an observable ‘having a value’, if two
observables have values for a particular state, then for this state
the sum of the two observables (if this sum is an observable†) has a
value equal to the sum of the values of the two observables separately
and the product of the two observables (if this product is an
observable‡) has a value equal to the product of the values of the two
observables separately.

In the general case we cannot speak of an observable having a value 
for a particular state, but we can speak of its having an average value 
for the state. We can go further and speak of the probability of its 
having any specified value for the state, meaning the probability of 
this specified value being obtained when one makes a measurement of 
the observable. This probability can be obtained from the general 
assumption in the following way. 

Let the observable be ξ and let the state correspond to the
normalized ket |x⟩. Then the general assumption tells us, not only
that the average value of ξ is ⟨x|ξ|x⟩, but also that the average
value of any function of ξ, f(ξ) say, is ⟨x|f(ξ)|x⟩. Take f(ξ) to be
that function of ξ which is equal to unity when ξ = a, a being some
real number, and zero otherwise. This function of ξ has a meaning
according to our general theory of functions of an observable, and it
may be denoted by δ_{ξa} in conformity with the general notation of
the symbol δ with two suffixes given on p. 62 (equation (17)). The
average value of this function of ξ is just the probability, P_a say,
of ξ having the value a. Thus

$$ P_a = \langle x|\delta_{\xi a}|x\rangle. \tag{45} $$

If a is not an eigenvalue of ξ, δ_{ξa} multiplied into any eigenket of
ξ is zero, and hence δ_{ξa} = 0 and P_a = 0. This agrees with a
conclusion of § 10, that any result of a measurement of an observable
must be one of its eigenvalues.
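
For discrete eigenvalues, (45) can be evaluated directly once the
eigenkets are known, since δ_{ξa} then acts as the sum of |ξ′⟩⟨ξ′| over
the eigenkets with ξ′ = a. The Python sketch below is an added
illustration under that assumption. The orthonormal eigenkets, the ket
`ket_x` and the name `probability` are hypothetical.

```python
import math


def probability(eigenpairs, ket, a):
    """P_a = <x|delta_{xi a}|x> for orthonormal eigenkets and a normalized ket.

    delta_{xi a} is taken as the sum of |xi'><xi'| over eigenkets with
    xi' = a, so each such eigenket contributes |<xi'|x>|^2.
    """
    total = 0.0
    for eigenvalue, eigenket in eigenpairs:
        if eigenvalue == a:
            amplitude = sum(e.conjugate() * c for e, c in zip(eigenket, ket))
            total += abs(amplitude) ** 2
    return total


s = 1.0 / math.sqrt(2.0)
# Assumed observable with eigenvalues +1 and -1 and these eigenkets.
eigenpairs = [(1, [s, s]), (-1, [s, -s])]
ket_x = [0.6, 0.8]

p_plus = probability(eigenpairs, ket_x, 1)
p_minus = probability(eigenpairs, ket_x, -1)
p_other = probability(eigenpairs, ket_x, 0.5)  # 0.5 is not an eigenvalue

# The probabilities sum to unity and reproduce the average value of xi.
average = 1 * p_plus + (-1) * p_minus
print(p_plus, p_minus, p_other, average)
```

The value of `average` agrees with ⟨x|ξ|x⟩ computed directly for the
same assumed matrix and ket, as the general assumption requires.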

If the possible results of a measurement of ξ form a range of
numbers, the probability of ξ having exactly a particular value will
be zero in most physical problems. The quantity of physical importance
is then the probability of ξ having a value within a small range, say
from a to a + da. This probability, which we may call P(a) da, is
equal to the average value of that function of ξ which is equal to
unity for ξ lying within the range a to a + da and zero otherwise.
This function of ξ has a meaning according to our general theory of
functions of an observable. Denoting it by χ(ξ), we have

$$ P(a)\,da = \langle x|\chi(\xi)|x\rangle. \tag{46} $$

If the range a to a + da does not include any eigenvalues of ξ, we
have as above χ(ξ) = 0 and P(a) = 0. If |x⟩ is not normalized, the
right-hand sides of (45) and (46) will still be proportional to the
probability of ξ having the value a and lying within the range a to
a + da respectively.

† This is not obviously so, since the sum may not have sufficient
eigenstates to form a complete set, in which case the sum, considered
as a single quantity, would not be measurable.

‡ Here the reality condition may fail, as well as the condition for
the eigenstates to form a complete set.

The assumption of § 10, that a measurement of ξ is certain to give
the result ξ′ if the system is in an eigenstate of ξ belonging to the
eigenvalue ξ′, is consistent with the general assumption for physical
interpretation and can in fact be deduced from it. Working from the
general assumption we see that, if |ξ′⟩ is an eigenket of ξ belonging
to the eigenvalue ξ′, then, in the case of discrete eigenvalues of ξ,

$$ \delta_{\xi a}|\xi'\rangle = 0 \quad \text{unless } a = \xi', $$

and in the case of a range of eigenvalues of ξ

$$ \chi(\xi)|\xi'\rangle = 0 \quad \text{unless the range } a \text{ to } a + da \text{ includes } \xi'. $$

In either case, for the state corresponding to |ξ′⟩, the probability
of ξ having any value other than ξ′ is zero.

An eigenstate of ξ belonging to an eigenvalue ξ′ lying in a range
is a state which cannot strictly be realized in practice, since it
would need an infinite amount of precision to get ξ to equal exactly
ξ′. The most that could be attained in practice would be to get ξ to
lie within a narrow range about the value ξ′. The system would then
be in a state approximating to an eigenstate of ξ. Thus an eigenstate
belonging to an eigenvalue in a range is a mathematical idealization
of what can be attained in practice. All the same such eigenstates
play a very useful role in the theory and one could not very well do
without them. Science contains many examples of theoretical concepts
which are limits of things met with in practice and are useful for
the precise formulation of laws of nature, although they are not
realizable experimentally, and this is just one more of them. It may
be that the infinite length of the ket vectors corresponding to these
eigenstates is connected with their unrealizability, and that all
realizable states correspond to ket vectors that can be normalized and
that form a Hilbert space.

13. Commutability and compatibility

A state may be simultaneously an eigenstate of two observables.
If the state corresponds to the ket vector |A⟩ and the observables are
ξ and η, we should then have the equations

$$ \xi|A\rangle = \xi'|A\rangle, \qquad \eta|A\rangle = \eta'|A\rangle, $$

where ξ′ and η′ are eigenvalues of ξ and η respectively. We can now
deduce

$$ \xi\eta|A\rangle = \xi\eta'|A\rangle = \xi'\eta'|A\rangle = \xi'\eta|A\rangle = \eta\xi'|A\rangle = \eta\xi|A\rangle, $$

or

$$ (\xi\eta - \eta\xi)|A\rangle = 0. $$

This suggests that the chances for the existence of a simultaneous
eigenstate are most favourable if ξη - ηξ = 0 and the two observables
commute. If they do not commute a simultaneous eigenstate is not
impossible, but is rather exceptional. On the other hand, if they do
commute there exist so many simultaneous eigenstates that they form a
complete set, as will now be proved.

Let ξ and η be two commuting observables. Take an eigenket of η,
|η′⟩ say, belonging to the eigenvalue η′, and expand it in terms of
eigenkets of ξ in the form of the right-hand side of (25), thus

$$ |\eta'\rangle = \int |\xi'\eta' c\rangle \, d\xi' + \sum_r |\xi^r\eta' d\rangle. \tag{47} $$

The eigenkets of ξ on the right-hand side here have η′ inserted in
them as an extra label, in order to remind us that they come from the
expansion of a special ket vector, namely |η′⟩, and not a general one
as in equation (25). We can now show that each of these eigenkets of ξ
is also an eigenket of η belonging to the eigenvalue η′. We have

$$ 0 = (\eta - \eta')|\eta'\rangle = \int (\eta - \eta')|\xi'\eta' c\rangle \, d\xi' + \sum_r (\eta - \eta')|\xi^r\eta' d\rangle. \tag{48} $$

Now the ket (η - η′)|ξ^r η′d⟩ satisfies

$$ \xi(\eta - \eta')|\xi^r\eta' d\rangle = (\eta - \eta')\xi|\xi^r\eta' d\rangle = (\eta - \eta')\xi^r|\xi^r\eta' d\rangle = \xi^r(\eta - \eta')|\xi^r\eta' d\rangle, $$

showing that it is an eigenket of ξ belonging to the eigenvalue ξ^r,
and similarly the ket (η - η′)|ξ′η′c⟩ is an eigenket of ξ belonging to
the eigenvalue ξ′. Equation (48) thus gives an integral plus a sum of
eigenkets of ξ equal to zero, which, as we have seen with equation
(31), is impossible unless the integrand and every term in the sum
vanishes. Hence

$$ (\eta - \eta')|\xi'\eta' c\rangle = 0, \qquad (\eta - \eta')|\xi^r\eta' d\rangle = 0, $$

so that all the kets appearing on the right-hand side of (47) are
eigenkets of η as well as of ξ. Equation (47) now gives |η′⟩ expanded
in terms of simultaneous eigenkets of ξ and η. Since any ket can be
expanded in terms of eigenkets |η′⟩ of η, it follows that any ket can
be expanded in terms of simultaneous eigenkets of ξ and η, and thus
the simultaneous eigenstates form a complete set.

The above simultaneous eigenkets of ξ and η, |ξ′η′c⟩ and |ξ^r η′d⟩,
are labelled by the eigenvalues ξ′ and η′, or ξ^r and η′, to which
they belong, together with the labels c and d which may also be
necessary. The procedure of using eigenvalues as labels for
simultaneous eigenvectors will be generally followed in the future,
just as it has been followed in the past for eigenvectors of single
observables.

The converse to the above theorem says that, if ξ and η are two
observables such that their simultaneous eigenstates form a complete
set, then ξ and η commute. To prove this, we note that, if |ξ′η′⟩ is a
simultaneous eigenket belonging to the eigenvalues ξ′ and η′,

$$ (\xi\eta - \eta\xi)|\xi'\eta'\rangle = (\xi'\eta' - \eta'\xi')|\xi'\eta'\rangle = 0. \tag{49} $$

Since the simultaneous eigenstates form a complete set, an arbitrary
ket |P⟩ can be expanded in terms of simultaneous eigenkets |ξ′η′⟩,
for each of which (49) holds, and hence

$$ (\xi\eta - \eta\xi)|P\rangle = 0 $$

and so ξη - ηξ = 0.
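
The connexion between commutation and simultaneous eigenkets can be
tried out on small matrices. The Python sketch below is an added
illustration, not part of the proof. It assumes a two-dimensional ket
space with real components, and the matrices `xi`, `eta` and
`non_commuting`, the trial kets and the helper names are hypothetical.

```python
def matmul(m, n):
    """Product of two square matrices stored as nested lists."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(m, n):
    """The operator mn - nm."""
    mn, nm = matmul(m, n), matmul(n, m)
    size = len(m)
    return [[mn[i][j] - nm[i][j] for j in range(size)] for i in range(size)]


def eigenvalue_for(op, ket, tol=1e-12):
    """Return the eigenvalue if ket is an eigenket of op, otherwise None.

    Real ket components are assumed.
    """
    applied = [sum(entry * c for entry, c in zip(row, ket)) for row in op]
    norm = sum(c * c for c in ket)
    candidate = sum(c * a for c, a in zip(ket, applied)) / norm
    if all(abs(a - candidate * c) < tol for a, c in zip(applied, ket)):
        return candidate
    return None


# Assumed example: xi and eta commute; non_commuting does not commute with xi.
xi = [[0, 1], [1, 0]]
eta = [[2, 1], [1, 2]]
non_commuting = [[1, 0], [0, -1]]
kets = [[1, 1], [1, -1]]

zero = [[0, 0], [0, 0]]
xi_eta_commute = commutator(xi, eta) == zero
xi_other_commute = commutator(xi, non_commuting) == zero

# The two trial kets are simultaneous eigenkets of xi and eta and span the
# space, so they form a complete set in this small example.
simultaneous = [(eigenvalue_for(xi, k), eigenvalue_for(eta, k)) for k in kets]
# Neither trial ket is an eigenket of the non-commuting operator.
shared_with_other = [eigenvalue_for(non_commuting, k) for k in kets]
print(xi_eta_commute, xi_other_commute, simultaneous, shared_with_other)
```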

The idea of simultaneous eigenstates may be extended to more than
two observables and the above theorem and its converse still hold,
i.e. if any set of observables commute, each with all the others,
their simultaneous eigenstates form a complete set, and conversely.
The same arguments used for the proof with two observables are
adequate for the general case; e.g., if we have three commuting
observables ξ, η, ζ, we can expand any simultaneous eigenket of ξ and
η in terms of eigenkets of ζ and then show that each of these
eigenkets of ζ is also an eigenket of ξ and of η. Thus the
simultaneous eigenket of ξ and η is expanded in terms of simultaneous
eigenkets of ξ, η, and ζ, and since any ket can be expanded in terms
of simultaneous eigenkets of ξ and η, it can also be expanded in terms
of simultaneous eigenkets of ξ, η, and ζ.

The orthogonality theorem applied to simultaneous eigenkets tells
us that two simultaneous eigenvectors of a set of commuting
observables are orthogonal if the sets of eigenvalues to which they
belong differ in any way.

Owing to the simultaneous eigenstates of two or more commuting
observables forming a complete set, we can set up a theory of
functions of two or more commuting observables on the same lines as
the theory of functions of a single observable given in § 11. If ξ, η,
ζ, ... are commuting observables, we define a general function f of
them to be that linear operator f(ξ, η, ζ, ...) which satisfies

$$ f(\xi, \eta, \zeta, \ldots)|\xi'\eta'\zeta'\ldots\rangle = f(\xi', \eta', \zeta', \ldots)|\xi'\eta'\zeta'\ldots\rangle, \tag{50} $$

where |ξ′η′ζ′...⟩ is any simultaneous eigenket of ξ, η, ζ, ...
belonging to the eigenvalues ξ′, η′, ζ′, ... . Here f is any function
such that f(a, b, c, ...) is defined for all values of a, b, c, ...
which are eigenvalues of ξ, η, ζ, ... respectively. As with a function
of a single observable defined by (34), we can show that
f(ξ, η, ζ, ...) is completely determined by (50), that

$$ \overline{f(\xi, \eta, \zeta, \ldots)} = \bar f(\xi, \eta, \zeta, \ldots), $$

corresponding to (37), and that if f(a, b, c, ...) is a real function,
f(ξ, η, ζ, ...) is real and is an observable.

We can now proceed to generalize the results (45) and (46). Given
a set of commuting observables $\xi$, $\eta$, $\zeta$,..., we may form that function
of them which is equal to unity when $\xi = a$, $\eta = b$, $\zeta = c$,..., $a$, $b$, $c$,...
being real numbers, and is equal to zero when any of these conditions
is not fulfilled. This function may be written
$\delta_{\xi a}\,\delta_{\eta b}\,\delta_{\zeta c}\ldots$, and is in
fact just the product in any order of the factors $\delta_{\xi a}$, $\delta_{\eta b}$, $\delta_{\zeta c}$,... defined
as functions of single observables, as may be seen by substituting this
product for $f(\xi, \eta, \zeta, \ldots)$ in the left-hand side of (50). The average
value of this function for any state is the probability, $P_{abc\ldots}$ say, of
$\xi$, $\eta$, $\zeta$,... having the values $a$, $b$, $c$,... respectively for that state. Thus
if the state corresponds to the normalized ket vector $|x\rangle$, we get from
our general assumption for physical interpretation

$$ P_{abc\ldots} = \langle x|\,\delta_{\xi a}\,\delta_{\eta b}\,\delta_{\zeta c}\ldots\,|x\rangle. \tag{51} $$

$P_{abc\ldots}$ is zero unless each of the numbers $a$, $b$, $c$,... is an eigenvalue of
the corresponding observable. If any of the numbers $a$, $b$, $c$,... is an
eigenvalue in a range of eigenvalues of the corresponding observable,
$P_{abc\ldots}$ will usually again be zero, but in this case we ought to replace
the requirement that this observable shall have exactly one value by
the requirement that it shall have a value lying within a small range,
which involves replacing one of the $\delta$ factors in (51) by a factor like
the $\chi(\xi)$ of equation (46). On carrying out such a replacement for
each of the observables $\xi$, $\eta$, $\zeta$,... whose corresponding numerical
value $a$, $b$, $c$,... lies in a range of eigenvalues, we shall get a
probability which does not in general vanish.
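
The rule (51) can be illustrated in a finite case. The following is a minimal sketch, assuming a hypothetical four-state system in which two commuting observables (called xi and eta here) are both diagonal with discrete eigenvalues; the eigenvalues and the ket components are invented for illustration only.

```python
# Illustrative 4-state system in which two commuting observables, here
# called xi and eta (hypothetical names), are both diagonal. Basis state i
# is a simultaneous eigenstate with eigenvalues (xi_vals[i], eta_vals[i]).
xi_vals = [1, 1, 2, 2]
eta_vals = [0, 5, 0, 5]

# Components of a normalized ket |x> in this basis (chosen arbitrarily).
x = [0.5, 0.5j, -0.5, 0.5]


def probability(a, b):
    """P_ab = <x| delta_{xi a} delta_{eta b} |x> for diagonal xi and eta."""
    return sum(
        abs(c) ** 2
        for c, xv, ev in zip(x, xi_vals, eta_vals)
        if xv == a and ev == b
    )


# Joint probabilities for every pair of eigenvalues, and their total.
p_joint = {(a, b): probability(a, b) for a in (1, 2) for b in (0, 5)}
total = sum(p_joint.values())

# A number that is not an eigenvalue of xi gives zero probability.
p_not_eigenvalue = probability(3, 0)
```

The product of the two delta factors acts here simply as a filter that keeps the components belonging to the chosen pair of eigenvalues.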

If certain observables commute, there exist states for which they all 
have particular values, in the sense explained at the bottom of p. 46, 
namely the simultaneous eigenstates. Thus one can give a meaning to 
several commuting observables having values at the same time. Further, we 
see from (51) that for any state one can give a meaning to the probability 
of particular results being obtained for simultaneous measurements of 
several commuting observables. This conclusion is an important new 
development. In general one cannot make an observation on a 
system in a definite state without disturbing that state and spoiling 
it for the purposes of a second observation. One cannot then give 
any meaning to the two observations being made simultaneously. 
The above conclusion tells us, though, that in the special case when 
the two observables commute, the observations are to be considered 
as non-interfering or compatible, in such a way that one can give a
meaning to the two observations being made simultaneously and can 
discuss the probability of any particular results being obtained. The 
two observations may, in fact, be considered as a single observation 
of a more complicated type, the result of which is expressible by two 
numbers instead of a single number. From the point of view of general 
theory, any two or more commuting observables may be counted as a
single observable, the result of a measurement of which consists of two or
more numbers. The states for which this measurement is certain to
lead to one particular result are the simultaneous eigenstates.

14. Basic vectors

In the preceding chapters we set up an algebraic scheme involving
certain abstract quantities of three kinds, namely bra vectors, ket 
vectors, and linear operators, and we expressed some of the funda- 
mental laws of quantum mechanics in terms of them. It would be 
possible to continue to develop the theory in terms of these abstract 
quantities and to use them for applications to particular problems. 
However, for some purposes it is more convenient to replace the 
abstract quantities by sets of numbers with analogous mathematical 
properties and to work in terms of these sets of numbers. The proce- 
dure is similar to using coordinates in geometry, and has the advan- 
tage of giving one greater mathematical power for the solving of 
particular problems. 

The way in which the abstract quantities are to be replaced by
numbers is not unique, there being many possible ways corresponding 
to the many systems of coordinates one can have in geometry. Each 
of these ways is called a representation and the set of numbers that 
replace an abstract quantity is called the representative of that 
abstract quantity in the representation. Thus the representative of 
an abstract quantity corresponds to the coordinates of a geometrical 
object. When one has a particular problem to work out in quantum 
mechanics, one can minimize the labour by using a representation 
in which the representatives of the more important abstract quanti- 
ties occurring in that problem are as simple as possible. 

To set up a representation in a general way, we take a complete 
set of bra vectors, i.e. a set such that any bra can be expressed 
linearly in terms of them (as a sum or an integral or possibly an 
integral plus a sum). These bras we call the basic bras of the repre- 
sentation. They are sufficient, as we shall see, to fix the representation 
completely. 

Take any ket $|a\rangle$ and form its scalar product with each of the basic
bras. The numbers so obtained constitute the representative of $|a\rangle$.
They are sufficient to determine the ket $|a\rangle$ completely, since if there
is a second ket, $|a_1\rangle$ say, for which these numbers are the same, the
difference $|a\rangle - |a_1\rangle$ will have its scalar product with any basic bra
vanishing, and hence its scalar product with any bra whatever will
vanish and $|a\rangle - |a_1\rangle$ itself will vanish.

We may suppose the basic bras to be labelled by one or more
parameters, $\lambda_1, \lambda_2, \ldots, \lambda_u$, each of which may take on certain numerical
values. The basic bras will then be written $\langle\lambda_1\lambda_2\ldots\lambda_u|$ and the
representative of $|a\rangle$ will be written $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$. This representative will
now consist of a set of numbers, one for each set of values that
$\lambda_1, \lambda_2, \ldots, \lambda_u$ may have in their respective domains. Such a set of
numbers just forms a function of the variables $\lambda_1, \lambda_2, \ldots, \lambda_u$. Thus the
representative of a ket may be looked upon either as a set of numbers
or as a function of the variables used to label the basic bras.

If the number of independent states of our dynamical system is
finite, equal to $n$ say, it is sufficient to take $n$ basic bras, which may
be labelled by a single parameter $\lambda$ taking on the values 1, 2, 3,..., $n$.
The representative of any ket $|a\rangle$ now consists of the set of $n$ numbers
$\langle 1|a\rangle, \langle 2|a\rangle, \langle 3|a\rangle, \ldots, \langle n|a\rangle$, which are precisely the coordinates of
the vector $|a\rangle$ referred to a system of coordinates in the usual way.
The idea of the representative of a ket vector is just a generalization 
of the idea of the coordinates of an ordinary vector and reduces to 
the latter when the number of dimensions of the space of the ket 
vectors is finite. 
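
For a finite number of dimensions the construction can be sketched numerically. The example below is only an illustration: the two basic kets and the ket $|a\rangle$ are invented, and the basic kets are assumed to be orthogonal and normalized, which is what allows $|a\rangle$ to be rebuilt from its representative by a simple sum.

```python
import math

s = 1 / math.sqrt(2)
# Basic kets of a hypothetical two-state orthogonal representation,
# given by their coordinates in some fixed underlying coordinate system.
basic_kets = [[s, s], [s, -s]]


def scalar_product(bra_of, ket):
    """<u|v>, where the bra <u| is the conjugate imaginary of the ket u."""
    return sum(u.conjugate() * v for u, v in zip(bra_of, ket))


ket_a = [2 + 0j, 3j]

# Representative of |a>: the numbers <1|a>, <2|a>.
representative = [scalar_product(k, ket_a) for k in basic_kets]

# The representative determines |a>: rebuild it as sum_k |k><k|a>.
# (This step relies on the basic kets being orthonormal.)
rebuilt = [
    sum(coeff * k[i] for coeff, k in zip(representative, basic_kets))
    for i in range(2)
]
```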

In a general representation there is no need for the basic bras to
be all independent. In most representations used in practice,
however, they are all independent, and also satisfy the more stringent
condition that any two of them are orthogonal. The representation
is then called an orthogonal representation.

Take an orthogonal representation with basic bras $\langle\lambda_1\lambda_2\ldots\lambda_u|$,
labelled by parameters $\lambda_1, \lambda_2, \ldots, \lambda_u$ whose domains are all real. Take
a ket $|a\rangle$ and form its representative $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$. Now form the
numbers $\lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$ and consider them as the representative of
a new ket $|b\rangle$. This is permissible since the numbers forming the
representative of a ket are independent, on account of the basic bras
being independent. The ket $|b\rangle$ is defined by the equation

$$ \langle\lambda_1\lambda_2\ldots\lambda_u|b\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle. $$

The ket $|b\rangle$ is evidently a linear function of the ket $|a\rangle$, so it may
be considered as the result of a linear operator applied to $|a\rangle$. Calling
this linear operator $L_1$, we have

$$ |b\rangle = L_1|a\rangle $$

and hence

$$ \langle\lambda_1\lambda_2\ldots\lambda_u|L_1|a\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle. $$

This equation holds for any ket $|a\rangle$, so we get

$$ \langle\lambda_1\lambda_2\ldots\lambda_u|L_1 = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|. \tag{1} $$

Equation (1) may be looked upon as the definition of the linear
operator $L_1$. It shows that each basic bra is an eigenbra of $L_1$, the
value of the parameter $\lambda_1$ being the eigenvalue belonging to it.

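From the condition that the basic bras are orthogonal we can
deduce that $L_1$ is real and is an observable. Let $\lambda_1', \lambda_2', \ldots, \lambda_u'$ and
$\lambda_1'', \lambda_2'', \ldots, \lambda_u''$ be two sets of values for the parameters $\lambda_1, \lambda_2, \ldots, \lambda_u$.
We have, putting $\lambda'$'s for the $\lambda$'s in (1) and multiplying on the right
by $|\lambda_1''\lambda_2''\ldots\lambda_u''\rangle$, the conjugate imaginary of the basic bra $\langle\lambda_1''\lambda_2''\ldots\lambda_u''|$,

$$ \langle\lambda_1'\lambda_2'\ldots\lambda_u'|L_1|\lambda_1''\lambda_2''\ldots\lambda_u''\rangle = \lambda_1'\langle\lambda_1'\lambda_2'\ldots\lambda_u'|\lambda_1''\lambda_2''\ldots\lambda_u''\rangle. $$

Interchanging $\lambda'$'s and $\lambda''$'s,

$$ \langle\lambda_1''\lambda_2''\ldots\lambda_u''|L_1|\lambda_1'\lambda_2'\ldots\lambda_u'\rangle = \lambda_1''\langle\lambda_1''\lambda_2''\ldots\lambda_u''|\lambda_1'\lambda_2'\ldots\lambda_u'\rangle. $$

On account of the basic bras being orthogonal, the right-hand sides
here vanish unless $\lambda_r'' = \lambda_r'$ for all $r$ from 1 to $u$, in which case the
right-hand sides are equal, and they are also real, $\lambda_1'$ being real. Thus,
whether the $\lambda''$'s are equal to the $\lambda'$'s or not,

$$ \langle\lambda_1'\lambda_2'\ldots\lambda_u'|L_1|\lambda_1''\lambda_2''\ldots\lambda_u''\rangle = \overline{\langle\lambda_1''\lambda_2''\ldots\lambda_u''|L_1|\lambda_1'\lambda_2'\ldots\lambda_u'\rangle} = \langle\lambda_1'\lambda_2'\ldots\lambda_u'|\bar{L}_1|\lambda_1''\lambda_2''\ldots\lambda_u''\rangle $$

from equation (4) of § 8. Since the $\langle\lambda_1'\lambda_2'\ldots\lambda_u'|$'s form a complete set
of bras and the $|\lambda_1''\lambda_2''\ldots\lambda_u''\rangle$'s form a complete set of kets, we can
infer that $L_1 = \bar{L}_1$. The further condition required for $L_1$ to be an
observable, namely that its eigenstates shall form a complete set, is
obviously satisfied since it has as eigenbras the basic bras, which
form a complete set.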

We can similarly introduce linear operators $L_2, L_3, \ldots, L_u$ by
multiplying $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$ by the factors $\lambda_2, \lambda_3, \ldots, \lambda_u$ in turn and considering
the resulting sets of numbers as representatives of kets. Each of these
$L$'s can be shown in the same way to have the basic bras as eigenbras
and to be real and an observable. The basic bras are simultaneous
eigenbras of all the $L$'s. Since these simultaneous eigenbras form a
complete set, it follows from a theorem of § 13 that any two of the
$L$'s commute.
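
The definition (1) can be checked in a small finite example. The sketch below assumes a hypothetical two-state orthogonal representation with normalized basic kets and invented real labels; for such a case the operator defined by (1) has the matrix $\sum_k \lambda_k |k\rangle\langle k|$ in the underlying coordinates, which is the form used in the code.

```python
import math

s = 1 / math.sqrt(2)
basic_kets = [[s, 1j * s], [s, -1j * s]]  # orthonormal, hypothetical
labels = [1.0, 3.0]                       # real values of the parameter lambda_1

# Matrix of L1 = sum_k lambda_k |k><k| in the underlying coordinates.
L1 = [
    [
        sum(lam * k[i] * k[j].conjugate() for lam, k in zip(labels, basic_kets))
        for j in range(2)
    ]
    for i in range(2)
]


def bra_times_operator(u, op):
    """Row vector <u| op, with <u| the conjugate imaginary of the ket u."""
    return [sum(u[i].conjugate() * op[i][j] for i in range(2)) for j in range(2)]


# Each basic bra should be an eigenbra of L1 with eigenvalue lambda_1.
eigenbra_rows = [bra_times_operator(k, L1) for k in basic_kets]

# L1 is real (self-adjoint) when its matrix equals its conjugate transpose.
is_real = all(
    abs(L1[i][j] - L1[j][i].conjugate()) < 1e-12
    for i in range(2)
    for j in range(2)
)
```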

It will now be shown that, if $\xi_1, \xi_2, \ldots, \xi_u$ are any set of commuting
observables, we can set up an orthogonal representation in which the basic
bras are simultaneous eigenbras of $\xi_1, \xi_2, \ldots, \xi_u$. Let us suppose first that
there is only one independent simultaneous eigenbra of $\xi_1, \xi_2, \ldots, \xi_u$
belonging to any set of eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$. Then we may take
these simultaneous eigenbras, with arbitrary numerical coefficients, as
our basic bras. They are all orthogonal on account of the orthogonality
theorem (any two of them will have at least one eigenvalue different,
which is sufficient to make them orthogonal) and there are sufficient
of them to form a complete set, from a result of § 13. They may
conveniently be labelled by the eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ to which they
belong, so that one of them is written $\langle\xi_1'\xi_2'\ldots\xi_u'|$.

Passing now to the general case when there are several independent
simultaneous eigenbras of $\xi_1, \xi_2, \ldots, \xi_u$ belonging to some sets of
eigenvalues, we must pick out from all the simultaneous eigenbras
belonging to a set of eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ a complete subset, the members
of which are all orthogonal to one another. (The condition of
completeness here means that any simultaneous eigenbra belonging to the
eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ can be expressed linearly in terms of the
members of the subset.) We must do this for each set of eigenvalues
$\xi_1', \xi_2', \ldots, \xi_u'$ and then put all the members of all the subsets together
and take them as the basic bras of the representation. These bras
are all orthogonal, two of them being orthogonal from the
orthogonality theorem if they belong to different sets of eigenvalues and from
the special way in which they were chosen if they belong to the same
set of eigenvalues, and they form altogether a complete set of bras,
as any bra can be expressed linearly in terms of simultaneous
eigenbras and each simultaneous eigenbra can then be expressed linearly
in terms of the members of a subset. There are infinitely many ways
of choosing the subsets, and each way provides one orthogonal
representation.

For labelling the basic bras in this general case, we may use the
eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ to which they belong, together with certain
additional real variables $\lambda_1, \lambda_2, \ldots, \lambda_v$ say, which must be introduced to
distinguish basic vectors belonging to the same set of eigenvalues
from one another. A basic bra is then written $\langle\xi_1'\xi_2'\ldots\xi_u'\lambda_1\lambda_2\ldots\lambda_v|$.
Corresponding to the variables $\lambda_1, \lambda_2, \ldots, \lambda_v$ we can define linear
operators $L_1, L_2, \ldots, L_v$ by equations like (1) and can show that these
linear operators have the basic bras as eigenbras, and that they are
real and observables, and that they commute with one another and
with the $\xi$'s. The basic bras are now simultaneous eigenbras of all
the commuting observables $\xi_1, \xi_2, \ldots, \xi_u, L_1, L_2, \ldots, L_v$.

Let us define a complete set of commuting observables to be a set of
observables which all commute with one another and for which there
is only one simultaneous eigenstate belonging to any set of
eigenvalues. Then the observables $\xi_1, \xi_2, \ldots, \xi_u, L_1, L_2, \ldots, L_v$ form a complete
set of commuting observables, there being only one independent
simultaneous eigenbra belonging to the eigenvalues $\xi_1', \xi_2', \ldots, \xi_u', \lambda_1, \lambda_2, \ldots, \lambda_v$,
namely the corresponding basic bra. Similarly the observables
$L_1, L_2, \ldots, L_u$ defined by equation (1) and the following work form
a complete set of commuting observables. With the help of this
definition the main results of the present section can be concisely
formulated thus:

(i) The basic bras of an orthogonal representation are
simultaneous eigenbras of a complete set of commuting
observables.

(ii) Given a complete set of commuting observables, we can set
up an orthogonal representation in which the basic bras are
simultaneous eigenbras of this complete set.

(iii) Any set of commuting observables can be made into a
complete commuting set by adding certain observables to it.

(iv) A convenient way of labelling the basic bras of an orthogonal
representation is by means of the eigenvalues of the complete
set of commuting observables of which the basic bras are
simultaneous eigenbras.

The conjugate imaginaries of the basic bras of a representation we
call the basic kets of the representation. Thus, if the basic bras are
denoted by $\langle\lambda_1\lambda_2\ldots\lambda_u|$, the basic kets will be denoted by $|\lambda_1\lambda_2\ldots\lambda_u\rangle$.
The representative of a bra $\langle b|$ is given by its scalar product with
each of the basic kets, i.e. by $\langle b|\lambda_1\lambda_2\ldots\lambda_u\rangle$. It may, like the
representative of a ket, be looked upon either as a set of numbers or as a
function of the variables $\lambda_1, \lambda_2, \ldots, \lambda_u$. We have

$$ \langle b|\lambda_1\lambda_2\ldots\lambda_u\rangle = \overline{\langle\lambda_1\lambda_2\ldots\lambda_u|b\rangle}, $$

showing that the representative of a bra is the conjugate complex of the
representative of the conjugate imaginary ket. In an orthogonal
representation, where the basic bras are simultaneous eigenbras of a
complete set of commuting observables, $\xi_1, \xi_2, \ldots, \xi_u$ say, the basic kets
will be simultaneous eigenkets of $\xi_1, \xi_2, \ldots, \xi_u$.

We have not yet considered the lengths of the basic vectors. With
an orthogonal representation, the natural thing to do is to normalize
the basic vectors, rather than leave their lengths arbitrary, and so
introduce a further stage of simplification into the representation.
However, it is possible to normalize them only if the parameters
which label them all take on discrete values. If any of these
parameters are continuous variables that can take on all values in a range,
the basic vectors are eigenvectors of some observable belonging to
eigenvalues in a range and are of infinite length, from the discussion
in § 10 (see p. 39 and top of p. 40). Some other procedure is then
needed to fix the numerical factors by which the basic vectors may
be multiplied. To get a convenient method of handling this question
a new mathematical notation is required, which will be given in the
next section.

15. The $\delta$ function

Our work in § 10 led us to consider quantities involving a certain
kind of infinity. To get a precise notation for dealing with these
infinities, we introduce a quantity $\delta(x)$ depending on a parameter $x$
satisfying the conditions

$$ \int_{-\infty}^{\infty} \delta(x)\,dx = 1, \qquad \delta(x) = 0 \ \text{for}\ x \neq 0. \tag{2} $$

To get a picture of $\delta(x)$, take a function of the real variable $x$ which
vanishes everywhere except inside a small domain, of length $\epsilon$ say,
surrounding the origin $x = 0$, and which is so large inside this domain
that its integral over this domain is unity. The exact shape of the
function inside this domain does not matter, provided there are no
unnecessarily wild variations (for example provided the function
is always of order $\epsilon^{-1}$). Then in the limit $\epsilon \to 0$ this function will go
over into $\delta(x)$.
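
This picture can be tried out numerically. The sketch below is only an illustration: it assumes one particular shape (a rectangular function of height $1/\epsilon$ on a domain of length $\epsilon$), a simple midpoint rule for the integrals, and $\cos x$ as a sample continuous function; none of these choices is essential.

```python
import math


def box(x, eps):
    """Function of order 1/eps on a domain of length eps around x = 0."""
    return 1.0 / eps if abs(x) < eps / 2 else 0.0


def integrate(g, lo, hi, n=200_000):
    """Midpoint rule; adequate for this illustration."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


# For each width, record the total integral of the box (which should stay
# close to unity) and the integral of cos(x) * box(x), which should
# approach cos(0) as eps is made smaller.
results = {}
for eps in (0.5, 0.1, 0.02):
    area = integrate(lambda x: box(x, eps), -1, 1)
    smeared = integrate(lambda x: math.cos(x) * box(x, eps), -1, 1)
    results[eps] = (area, smeared)
```

As the domain shrinks the second number settles towards the value of the sample function at the origin, while the first remains near unity.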

$\delta(x)$ is not a function of $x$ according to the usual mathematical
definition of a function, which requires a function to have a definite
value for each point in its domain, but is something more general,
which we may call an 'improper function' to show up its difference
from a function defined by the usual definition. Thus $\delta(x)$ is not a
quantity which can be generally used in mathematical analysis like
an ordinary function, but its use must be confined to certain simple
types of expression for which it is obvious that no inconsistency
can arise.

The most important property of $\delta(x)$ is exemplified by the
following equation,

$$ \int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0), \tag{3} $$

where $f(x)$ is any continuous function of $x$. We can easily see the
validity of this equation from the above picture of $\delta(x)$. The
left-hand side of (3) can depend only on the values of $f(x)$ very close
to the origin, so that we may replace $f(x)$ by its value at the origin,
$f(0)$, without essential error. Equation (3) then follows from the
first of equations (2). By making a change of origin in (3), we can
deduce the formula

$$ \int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a), \tag{4} $$

where $a$ is any real number. Thus the process of multiplying a function
of $x$ by $\delta(x-a)$ and integrating over all $x$ is equivalent to the process of
substituting $a$ for $x$. This general result holds also if the function of $x$ is
not a numerical one, but is a vector or linear operator depending on $x$.

The range of integration in (3) and (4) need not be from $-\infty$ to $\infty$,
but may be over any domain surrounding the critical point at which
the $\delta$ function does not vanish. In future the limits of integration
will usually be omitted in such equations, it being understood that
the domain of integration is a suitable one.

Equations (3) and (4) show that, although an improper function 
does not itself have a well-defined value, when it occurs as a factor
in an integrand the integral has a well-defined value. In quantum 
theory, whenever an improper function appears, it will be something 
which is to be used ultimately in an integrand. Therefore it should be 
possible to rewrite the theory in a form in which the improper func- 
tions appear all through only in integrands. One could then eliminate 
the improper functions altogether. The use of improper functions 
thus does not involve any lack of rigour in the theory, but is merely 
a convenient notation, enabling us to express in a concise form 
certain relations which we could, if necessary, rewrite in a form not 
involving improper functions, but only in a cumbersome way which 
would tend to obscure the argument. 

An alternative way of defining the $\delta$ function is as the differential
coefficient $\epsilon'(x)$ of the function $\epsilon(x)$ given by

$$ \epsilon(x) = \begin{cases} 0 & (x < 0) \\ 1 & (x > 0). \end{cases} \tag{5} $$

We may verify that this is equivalent to the previous definition by
substituting $\epsilon'(x)$ for $\delta(x)$ in the left-hand side of (3) and integrating
by parts. We find, for $g_1$ and $g_2$ two positive numbers,

$$ \int_{-g_2}^{g_1} f(x)\,\epsilon'(x)\,dx = \Big[f(x)\,\epsilon(x)\Big]_{-g_2}^{g_1} - \int_{-g_2}^{g_1} f'(x)\,\epsilon(x)\,dx $$
$$ = f(g_1) - \int_{0}^{g_1} f'(x)\,dx $$
$$ = f(0), $$

in agreement with (3). The $\delta$ function appears whenever one
differentiates a discontinuous function.

There are a number of elementary equations which one can write
down about $\delta$ functions. These equations are essentially rules of
manipulation for algebraic work involving $\delta$ functions. The meaning
of any of these equations is that its two sides give equivalent results
as factors in an integrand.

Examples of such equations are

$$ \delta(-x) = \delta(x), \tag{6} $$
$$ x\,\delta(x) = 0, \tag{7} $$
$$ \delta(ax) = a^{-1}\,\delta(x) \quad (a > 0), \tag{8} $$
$$ \delta(x^2 - a^2) = \tfrac{1}{2}a^{-1}\{\delta(x-a) + \delta(x+a)\} \quad (a > 0), \tag{9} $$
$$ \int \delta(a-x)\,dx\,\delta(x-b) = \delta(a-b), \tag{10} $$
$$ f(x)\,\delta(x-a) = f(a)\,\delta(x-a). \tag{11} $$

Equation (6), which merely states that $\delta(x)$ is an even function of its
variable $x$, is trivial. To verify (7) take any continuous function of
$x$, $f(x)$. Then

$$ \int f(x)\,x\,\delta(x)\,dx = 0, $$

from (3). Thus $x\,\delta(x)$ as a factor in an integrand is equivalent to
zero, which is just the meaning of (7). (8) and (9) may be verified
by similar elementary arguments. To verify (10) take any continuous
function of $a$, $f(a)$. Then

$$ \int f(a)\,da \int \delta(a-x)\,dx\,\delta(x-b) = \int \delta(x-b)\,dx \int f(a)\,da\,\delta(a-x) $$
$$ = \int \delta(x-b)\,dx\,f(x) = \int f(a)\,da\,\delta(a-b). $$

Thus the two sides of (10) are equivalent as factors in an integrand
with $a$ as variable of integration. It may be shown in the same way
that they are equivalent also as factors in an integrand with $b$ as
variable of integration, so that equation (10) is justified from either
of these points of view. Equation (11) is also easily justified, with
the help of (4), from two points of view.

Equation (10) would be given by an application of (4) with
$f(x) = \delta(x-b)$. We have here an illustration of the fact that we may
often use an improper function as though it were an ordinary
continuous function, without getting a wrong result.
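
Rules (8) and (9) can be tested numerically by putting a narrow but ordinary function in place of the $\delta$ function. The following is a rough sketch under stated assumptions: a normalized Gaussian of small width stands in for $\delta$, the integrals are done by the midpoint rule, and the test function and the value of $a$ are arbitrary choices made for illustration.

```python
import math


def delta_eps(u, eps=1e-3):
    """Narrow normalized Gaussian standing in for the delta function."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2 * math.pi))


def integrate(g, lo, hi, n=400_000):
    """Midpoint rule on a grid fine enough to resolve the narrow peaks."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def f(x):
    """Arbitrary smooth test function."""
    return x * x + 1.0


a = 2.0

# Rule (8): delta(a x) acts like delta(x) / a in an integrand.
lhs_8 = integrate(lambda x: f(x) * delta_eps(a * x), -4, 4)
rhs_8 = f(0.0) / a

# Rule (9): delta(x^2 - a^2) acts like {delta(x - a) + delta(x + a)} / (2a).
lhs_9 = integrate(lambda x: f(x) * delta_eps(x * x - a * a), -4, 4)
rhs_9 = (f(a) + f(-a)) / (2 * a)
```

In each case the two sides are compared only as factors in an integrand, which is the sense in which the rules are meant.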

Equation (7) shows that, whenever one divides both sides of an
equation by a variable $x$ which can take on the value zero, one
should add on to one side an arbitrary multiple of $\delta(x)$, i.e. from an
equation

$$ A = B \tag{12} $$

one cannot infer

$$ A/x = B/x, $$

but only

$$ A/x = B/x + c\,\delta(x), \tag{13} $$

where $c$ is unknown.

As an illustration of work with the $\delta$ function, we may consider the
differentiation of $\log x$. The usual formula

$$ \frac{d}{dx}\log x = \frac{1}{x} \tag{14} $$

requires examination for the neighbourhood of $x = 0$. In order to
make the reciprocal function $1/x$ well defined in the neighbourhood
of $x = 0$ (in the sense of an improper function) we must impose on
it an extra condition, such as that its integral from $-\epsilon$ to $\epsilon$ vanishes.
With this extra condition, the integral of the right-hand side of (14)
from $-\epsilon$ to $\epsilon$ vanishes, while that of the left-hand side of (14) equals
$\log(-1)$, so that (14) is not a correct equation. To correct it, we must
remember that, taking principal values, $\log x$ has a pure imaginary
term $i\pi$ for negative values of $x$. As $x$ passes through the value zero
this pure imaginary term vanishes discontinuously. The
differentiation of this pure imaginary term gives us the result $-i\pi\,\delta(x)$, so
that (14) should read

$$ \frac{d}{dx}\log x = \frac{1}{x} - i\pi\,\delta(x). \tag{15} $$

The particular combination of reciprocal function and $\delta$ function
appearing in (15) plays an important part in the quantum theory of
collision processes (see § 50).





16. Properties of the basic vectors

Using the notation of the $\delta$ function, we can proceed with the theory
of representations. Let us suppose first that we have a single
observable $\xi$ forming by itself a complete commuting set, the condition for
this being that there is only one eigenstate of $\xi$ belonging to any
eigenvalue $\xi'$, and let us set up an orthogonal representation in which
the basic vectors are eigenvectors of $\xi$ and are written $\langle\xi'|$, $|\xi'\rangle$.

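In the case when the eigenvalues of $\xi$ are discrete, we can normalize
the basic vectors, and we then have

$$ \langle\xi'|\xi''\rangle = 0 \quad (\xi' \neq \xi''), $$
$$ \langle\xi'|\xi'\rangle = 1. $$

These equations can be combined into the single equation

$$ \langle\xi'|\xi''\rangle = \delta_{\xi'\xi''}, \tag{16} $$

where the symbol $\delta$ with two suffixes, which we shall often use in the
future, has the meaning

$$ \delta_{rs} = \begin{cases} 0 & \text{when } r \neq s \\ 1 & \text{when } r = s. \end{cases} \tag{17} $$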


In the case when the eigenvalues of $\xi$ are continuous we cannot
normalize the basic vectors. If we now consider the quantity $\langle\xi'|\xi''\rangle$
with $\xi'$ fixed and $\xi''$ varying, we see from the work connected with
expression (29) of § 10 that this quantity vanishes for $\xi'' \neq \xi'$ and
that its integral over a range of $\xi''$ extending through the value $\xi'$
is finite, equal to $c$ say. Thus

$$ \langle\xi'|\xi''\rangle = c\,\delta(\xi'-\xi''). $$

From (30) of § 10, $c$ is a positive number. It may vary with $\xi'$, so
we should write it $c(\xi')$ or $c'$ for brevity, and thus we have

$$ \langle\xi'|\xi''\rangle = c'\,\delta(\xi'-\xi''). \tag{18} $$

Alternatively, we have

$$ \langle\xi'|\xi''\rangle = c''\,\delta(\xi'-\xi''), \tag{19} $$

where $c''$ is short for $c(\xi'')$, the right-hand sides of (18) and (19) being
equal on account of (11).

Let us pass to another representation whose basic vectors are
eigenvectors of $\xi$, the new basic vectors being numerical multiples of
the previous ones. Calling the new basic vectors $\langle\xi'^*|$, $|\xi'^*\rangle$, with the
additional label * to distinguish them from the previous ones, we have

$$ \langle\xi'^*| = k'\langle\xi'|, \qquad |\xi'^*\rangle = \bar{k}'|\xi'\rangle, $$

where $k'$ is short for $k(\xi')$ and is a number depending on $\xi'$. We get

$$ \langle\xi'^*|\xi''^*\rangle = k'\bar{k}''\langle\xi'|\xi''\rangle = k'\bar{k}''c'\,\delta(\xi'-\xi'') $$

with the help of (18). This may be written

$$ \langle\xi'^*|\xi''^*\rangle = k'\bar{k}'c'\,\delta(\xi'-\xi'') $$

from (11). By choosing $k'$ so that its modulus is $c'^{-1/2}$, which is possible
since $c'$ is positive, we arrange to have

$$ \langle\xi'^*|\xi''^*\rangle = \delta(\xi'-\xi''). \tag{20} $$

The lengths of the new basic vectors are now fixed so as to make the
representation as simple as possible. The way these lengths were
fixed is in some respects analogous to the normalizing of the basic
vectors in the case of discrete $\xi'$, equation (20) being of the form of
(16) with the $\delta$ function $\delta(\xi'-\xi'')$ replacing the $\delta$ symbol $\delta_{\xi'\xi''}$ of
equation (16). We shall continue to work with the new representation
and shall drop the * labels in it to save writing. Thus (20) will now
be written

$$ \langle\xi'|\xi''\rangle = \delta(\xi'-\xi''). \tag{21} $$

We can develop the theory on closely parallel lines for the discrete
and continuous cases. For the discrete case we have, using (16),

$$ \sum_{\xi'} |\xi'\rangle\langle\xi'|\xi''\rangle = \sum_{\xi'} |\xi'\rangle\,\delta_{\xi'\xi''} = |\xi''\rangle, $$

the sum being taken over all eigenvalues. This equation holds for
any basic ket $|\xi''\rangle$ and hence, since the basic kets form a complete set,

$$ \sum_{\xi'} |\xi'\rangle\langle\xi'| = 1. \tag{22} $$

This is a useful equation expressing an important property of the
basic vectors, namely, if $|\xi'\rangle$ is multiplied on the right by $\langle\xi'|$ the
resulting linear operator, summed for all $\xi'$, equals the unit operator.
Equations (16) and (22) give the fundamental properties of the basic
vectors for the discrete case.

Similarly, for the continuous case we have, using (21),

$$ \int |\xi'\rangle\,d\xi'\,\langle\xi'|\xi''\rangle = \int |\xi'\rangle\,d\xi'\,\delta(\xi'-\xi'') = |\xi''\rangle, \tag{23} $$

from (4) applied with a ket vector for $f(x)$, the range of integration
being the range of eigenvalues. This holds for any basic ket $|\xi''\rangle$
and hence

$$ \int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1. \tag{24} $$

This is of the same form as (22) with an integral replacing the sum.
Equations (21) and (24) give the fundamental properties of the basic
vectors for the continuous case.
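
For the discrete case the two fundamental properties can be checked directly in a small example. The sketch below assumes a hypothetical observable with just two eigenvalues, whose normalized eigenkets are given by invented coordinates; it forms the scalar products appearing in (16) and the sum of operators appearing in (22).

```python
import math

s = 1 / math.sqrt(2)
basic_kets = [[s, 1j * s], [s, -1j * s]]  # hypothetical normalized eigenkets


def braket(u, v):
    """<u|v>, with <u| the conjugate imaginary of the ket u."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def outer(ket):
    """Matrix of the operator |k><k|."""
    return [[a * b.conjugate() for b in ket] for a in ket]


def add(m1, m2):
    """Sum of two matrices given as nested lists."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


# Scalar products <xi'|xi''>, to be compared with the two-suffix delta.
gram = [[braket(u, v) for v in basic_kets] for u in basic_kets]

# Sum over eigenvalues of |xi'><xi'|, to be compared with the unit operator.
total = [[0j, 0j], [0j, 0j]]
for ket in basic_kets:
    total = add(total, outer(ket))
```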

Equations (22) and (24) enable one to expand any bra or ket in
terms of the basic vectors. For example, we get for the ket $|P\rangle$ in the
discrete case, by multiplying (22) on the right by $|P\rangle$,

$$ |P\rangle = \sum_{\xi'} |\xi'\rangle\langle\xi'|P\rangle, \tag{25} $$

which gives $|P\rangle$ expanded in terms of the $|\xi'\rangle$'s and shows that the
coefficients in the expansion are $\langle\xi'|P\rangle$, which are just the numbers
forming the representative of $|P\rangle$. Similarly, in the continuous case,

$$ |P\rangle = \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle, \tag{26} $$

giving $|P\rangle$ as an integral over the $|\xi'\rangle$'s, with the coefficient in the
integrand again just the representative $\langle\xi'|P\rangle$ of $|P\rangle$. The conjugate
imaginary equations to (25) and (26) would give the bra vector $\langle P|$
expanded in terms of the basic bras.

Our present mathematical methods enable us in the continuous
case to expand any ket as an integral of eigenkets of $\xi$. If we do not
use the $\delta$ function notation, the expansion of a general ket will consist
of an integral plus a sum, as in equation (25) of § 10, but the $\delta$ function
enables us to replace the sum by an integral in which the integrand
consists of terms each containing a $\delta$ function as a factor. For
example, the eigenket $|\xi''\rangle$ may be replaced by an integral of
eigenkets, as is shown by the second of equations (23).

If $\langle Q|$ is any bra and $|P\rangle$ any ket we get, by further applications
of (22) and (24),

$$ \langle Q|P\rangle = \sum_{\xi'} \langle Q|\xi'\rangle\langle\xi'|P\rangle \tag{27} $$

for discrete $\xi'$ and

$$ \langle Q|P\rangle = \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{28} $$

for continuous $\xi'$. These equations express the scalar product of $\langle Q|$
and $|P\rangle$ in terms of their representatives $\langle Q|\xi'\rangle$ and $\langle\xi'|P\rangle$.
Equation (27) is just the usual formula for the scalar product of two
vectors in terms of the coordinates of the vectors, and (28) is the
natural modification of this formula for the case of continuous $\xi'$,
with an integral instead of a sum.
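
Formula (27) lends itself to a direct numerical check. The sketch below assumes a hypothetical two-state orthogonal representation with normalized basic kets, and two invented vectors; it compares the scalar product computed directly with the sum over representatives.

```python
import math

s = 1 / math.sqrt(2)
basic_kets = [[s, 1j * s], [s, -1j * s]]  # hypothetical orthonormal basic kets


def braket(u, v):
    """<u|v>, with <u| the conjugate imaginary of the ket u."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


ket_P = [1 + 2j, 3 - 1j]
ket_Q = [2 - 1j, 0.5j]  # the bra <Q| is the conjugate imaginary of this ket

# Scalar product computed directly from the underlying coordinates.
direct = braket(ket_Q, ket_P)

# Representatives <xi'|P> and <Q|xi'> in the chosen representation.
rep_P = [braket(k, ket_P) for k in basic_kets]
rep_Q = [braket(ket_Q, k) for k in basic_kets]

# Right-hand side of (27): sum over xi' of <Q|xi'><xi'|P>.
via_representatives = sum(q * p for q, p in zip(rep_Q, rep_P))
```

The representative of the bra is the conjugate complex of the representative of the corresponding ket, so no further conjugation is needed in the final sum.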

The generalization of the foregoing work to the case when $\xi$ has
both discrete and continuous eigenvalues is quite straightforward.
Using $\xi^r$ and $\xi^s$ to denote discrete eigenvalues and $\xi'$ and $\xi''$ to denote
continuous eigenvalues, we have the set of equations

$$ \langle\xi^r|\xi^s\rangle = \delta_{\xi^r\xi^s}, \qquad \langle\xi^r|\xi'\rangle = 0, \qquad \langle\xi'|\xi''\rangle = \delta(\xi'-\xi'') \tag{29} $$

as the generalization of (16) or (21). These equations express that
the basic vectors are all orthogonal, that those belonging to discrete
eigenvalues are normalized and those belonging to continuous
eigenvalues have their lengths fixed by the same rule as led to (20). From
(29) we can derive, as the generalization of (22) or (24),

$$ \sum_{\xi^r} |\xi^r\rangle\langle\xi^r| + \int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1, \tag{30} $$

the range of integration being the range of continuous eigenvalues.
With the help of (30), we get immediately

$$ |P\rangle = \sum_{\xi^r} |\xi^r\rangle\langle\xi^r|P\rangle + \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{31} $$

as the generalization of (25) or (26), and

$$ \langle Q|P\rangle = \sum_{\xi^r} \langle Q|\xi^r\rangle\langle\xi^r|P\rangle + \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{32} $$

as the generalization of (27) or (28).

Let us now pass to the general case when we have several commuting
observables $\xi_1, \xi_2, \ldots, \xi_u$ forming a complete commuting set and set up
an orthogonal representation in which the basic vectors are
simultaneous eigenvectors of all of them, and are written
$\langle\xi_1'\ldots\xi_u'|$, $|\xi_1'\ldots\xi_u'\rangle$.
Let us suppose $\xi_1, \xi_2, \ldots, \xi_v$ $(v \le u)$ have discrete eigenvalues and
$\xi_{v+1}, \ldots, \xi_u$ have continuous eigenvalues.

Consider the quantity
$\langle\xi_1'\ldots\xi_v'\,\xi_{v+1}'\ldots\xi_u'|\xi_1'\ldots\xi_v'\,\xi_{v+1}''\ldots\xi_u''\rangle$. From the
orthogonality theorem, it must vanish unless each $\xi_s'' = \xi_s'$ for
$s = v+1, \ldots, u$. By extending the work connected with expression
(29) of § 10 to simultaneous eigenvectors of several commuting
observables and extending also the axiom (30), we find that the
$(u-v)$-fold integral of this quantity with respect to each $\xi_s''$ over
a range extending through the value $\xi_s'$ is a finite positive number.
Calling this number $c'$, the ' denoting that it is a function of
$\xi_1', \ldots, \xi_v', \xi_{v+1}', \ldots, \xi_u'$, we can express our results by the equation

$$ \langle\xi_1'\ldots\xi_v'\,\xi_{v+1}'\ldots\xi_u'|\xi_1'\ldots\xi_v'\,\xi_{v+1}''\ldots\xi_u''\rangle = c'\,\delta(\xi_{v+1}'-\xi_{v+1}'')\ldots\delta(\xi_u'-\xi_u''), \tag{33} $$

with one $\delta$ factor on the right-hand side for each value of $s$ from
$v+1$ to $u$. We now change the lengths of our basic vectors so as to
make $c'$ unity, by a procedure similar to that which led to (20). By
a further use of the orthogonality theorem, we get finally

$$ \langle\xi_1'\ldots\xi_u'|\xi_1''\ldots\xi_u''\rangle = \delta_{\xi_1'\xi_1''}\ldots\delta_{\xi_v'\xi_v''}\,\delta(\xi_{v+1}'-\xi_{v+1}'')\ldots\delta(\xi_u'-\xi_u''), \tag{34} $$

with a two-suffix $\delta$ symbol on the right-hand side for each $\xi$ with
discrete eigenvalues and a $\delta$ function for each $\xi$ with continuous
eigenvalues. This is the generalization of (16) or (21) to the case when
there are several commuting observables in the complete set.

From (34) we can derive, as the generalization of (22) or (24),

$$ \sum_{\xi_1'\ldots\xi_v'} \int\!\cdots\!\int |\xi_1'\ldots\xi_u'\rangle\,d\xi_{v+1}'\ldots d\xi_u'\,\langle\xi_1'\ldots\xi_u'| = 1, \tag{35} $$

the integral being a $(u-v)$-fold one over all the $\xi'$'s with continuous
eigenvalues and the summation being over all the $\xi'$'s with discrete
eigenvalues. Equations (34) and (35) give the fundamental properties
of the basic vectors in the present case. From (35) we can
immediately write down the generalization of (25) or (26) and of (27) or (28).

The case we have just considered can be further generalized by 
allowing some of the £’s to have both discrete and continuous eigen- 
values. The modifications required in the equations are quite straight- 
forward, but will not be given here as they are rather cumbersome to 
write down in general form. 

There are some problems in which it is convenient not to make the
$c'$ of equation (33) equal unity, but to make it equal to some definite
function of the $\xi'$'s instead. Calling this function of the $\xi'$'s $\rho'^{-1}$ we
then have, instead of (34),

$$\langle\xi'_1\ldots\xi'_u|\xi''_1\ldots\xi''_u\rangle = \rho'^{-1}\,\delta_{\xi'_1\xi''_1}\ldots\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1})\ldots\delta(\xi'_u-\xi''_u), \tag{36}$$

and instead of (35) we get

$$\sum_{\xi'_1..\xi'_v}\int\cdots\int |\xi'_1\ldots\xi'_u\rangle\,\rho'\,d\xi'_{v+1}\ldots d\xi'_u\,\langle\xi'_1\ldots\xi'_u| = 1. \tag{37}$$

$\rho'$ is called the weight function of the representation, $\rho'\,d\xi'_{v+1}\ldots d\xi'_u$
being the 'weight' attached to a small volume element of the space
of the variables $\xi'_{v+1}, \ldots, \xi'_u$.

The representations we considered previously all had the weight
function unity. The introduction of a weight function not unity is
entirely a matter of convenience and does not add anything to the
mathematical power of the representation. The basic bras $\langle\xi'_1\ldots\xi'_u{}^*|$
of a representation with the weight function $\rho'$ are connected with
the basic bras $\langle\xi'_1\ldots\xi'_u|$ of the corresponding representation with the
weight function unity by

$$\langle\xi'_1\ldots\xi'_u{}^*| = \rho'^{-1/2}\,\langle\xi'_1\ldots\xi'_u|, \tag{38}$$

as is easily verified. An example of a useful representation with
non-unit weight function occurs when one has two $\xi$'s which are
the polar and azimuthal angles $\theta$ and $\phi$ giving a direction in
three-dimensional space and one takes $\rho' = \sin\theta'$. One then has the element
of solid angle $\sin\theta'\,d\theta'\,d\phi'$ occurring in (37).
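The relations (36), (37) and (38) can be checked on a grid. The following is only a sketch under stated assumptions: a single continuous variable $\theta$ is sampled at `n` midpoints with spacing `h`, the $\delta$ function is replaced by its grid analogue $\delta_{jk}/h$, and the names `B`, `B_star`, `gram` and `completeness` are illustrative, not part of the text.

```python
import numpy as np

n = 50
h = np.pi / n
theta = (np.arange(n) + 0.5) * h        # midpoints, so that sin(theta) > 0 everywhere
rho = np.sin(theta)                      # weight function rho' = sin(theta')

# Columns of B: basic kets with weight function unity, <b_j|b_k> = delta_jk / h
B = np.eye(n) / np.sqrt(h)

# Rescale column k by rho_k^(-1/2), the grid analogue of equation (38)
B_star = B / np.sqrt(rho)

# Grid analogue of (36): <b*_j|b*_k> = rho_k^(-1) * delta_jk / h
gram = B_star.conj().T @ B_star

# Grid analogue of (37): sum_k |b*_k> rho_k h <b*_k| = 1
completeness = (B_star * (rho * h)) @ B_star.conj().T
```

With these assumptions `gram` is diagonal with entries `1 / (rho * h)` and `completeness` is the unit matrix, which mirrors how the factor $\rho'$ in (37) compensates the factor $\rho'^{-1}$ in (36).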

17. The representation of linear operators 

In § 14 we saw how to represent ket and bra vectors by sets of 
numbers. We now have to do the same for linear operators, in order 
to have a complete scheme for representing all our abstract quantities 
by sets of numbers. The same basic vectors that we had in § 14 can 
be used again for this purpose. 

Let us suppose the basic vectors are simultaneous eigenvectors of
a complete set of commuting observables $\xi_1, \xi_2, \ldots, \xi_u$. If $\alpha$ is any
linear operator, we take a general basic bra $\langle\xi'_1\ldots\xi'_u|$ and a general
basic ket $|\xi''_1\ldots\xi''_u\rangle$ and form the numbers

$$\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle. \tag{39}$$

These numbers are sufficient to determine $\alpha$ completely, since in the
first place they determine the ket $\alpha|\xi''_1\ldots\xi''_u\rangle$ (as they provide the
representative of this ket), and the value of this ket for all the basic
kets $|\xi''_1\ldots\xi''_u\rangle$ determines $\alpha$. The numbers (39) are called the
representative of the linear operator $\alpha$ or of the dynamical variable $\alpha$. They
are more complicated than the representative of a ket or bra vector
in that they involve the parameters that label two basic vectors
instead of one.

Let us examine the form of these numbers in simple cases. Take
first the case when there is only one $\xi$, forming a complete commuting
set by itself, and suppose that it has discrete eigenvalues $\xi'$. The
representative of $\alpha$ is then the discrete set of numbers $\langle\xi'|\alpha|\xi''\rangle$. If
one had to write out these numbers explicitly, the natural way of
arranging them would be as a two-dimensional array, thus:

$$\begin{matrix}
\langle\xi^1|\alpha|\xi^1\rangle & \langle\xi^1|\alpha|\xi^2\rangle & \langle\xi^1|\alpha|\xi^3\rangle & \cdots \\
\langle\xi^2|\alpha|\xi^1\rangle & \langle\xi^2|\alpha|\xi^2\rangle & \langle\xi^2|\alpha|\xi^3\rangle & \cdots \\
\langle\xi^3|\alpha|\xi^1\rangle & \langle\xi^3|\alpha|\xi^2\rangle & \langle\xi^3|\alpha|\xi^3\rangle & \cdots \\
\vdots & \vdots & \vdots &
\end{matrix} \tag{40}$$

where $\xi^1, \xi^2, \xi^3, \ldots$ are all the eigenvalues of $\xi$. Such an array is called
a matrix and the numbers are called the elements of the matrix. We
make the convention that the elements must always be arranged so
that those in the same row refer to the same basic bra vector and
those in the same column refer to the same basic ket vector.

An element $\langle\xi'|\alpha|\xi'\rangle$ referring to two basic vectors with the same
label is called a diagonal element of the matrix, as all such elements
lie on a diagonal. If we put $\alpha$ equal to unity, we have from (16) all
the diagonal elements equal to unity and all the other elements equal
to zero. The matrix is then called the unit matrix.

If $\alpha$ is real, we have

$$\langle\xi'|\alpha|\xi''\rangle = \overline{\langle\xi''|\alpha|\xi'\rangle}. \tag{41}$$

The effect of these conditions on the matrix (40) is to make the
diagonal elements all real and each of the other elements equal the
conjugate complex of its mirror reflection in the diagonal. The matrix
is then called a Hermitian matrix.

If we put $\alpha$ equal to $\xi$, we get for a general element of the matrix

$$\langle\xi'|\xi|\xi''\rangle = \xi'\langle\xi'|\xi''\rangle = \xi'\,\delta_{\xi'\xi''}. \tag{42}$$

Thus all the elements not on the diagonal are zero. The matrix is
then called a diagonal matrix. Its diagonal elements are just equal
to the eigenvalues of $\xi$. More generally, if we put $\alpha$ equal to $f(\xi)$, a
function of $\xi$, we get

$$\langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta_{\xi'\xi''}, \tag{43}$$

and the matrix is again a diagonal matrix.

Let us determine the representative of a product of two linear
operators $\alpha$ and $\beta$ in terms of the representatives of the factors.
From equation (22) with $\xi'''$ substituted for $\xi'$ we obtain

$$\begin{aligned}
\langle\xi'|\alpha\beta|\xi''\rangle &= \langle\xi'|\alpha \sum_{\xi'''} |\xi'''\rangle\langle\xi'''|\beta|\xi''\rangle \\
&= \sum_{\xi'''} \langle\xi'|\alpha|\xi'''\rangle\langle\xi'''|\beta|\xi''\rangle,
\end{aligned} \tag{44}$$

which gives us the required result. Equation (44) shows that the
matrix formed by the elements $\langle\xi'|\alpha\beta|\xi''\rangle$ equals the product of the
matrices formed by the elements $\langle\xi'|\alpha|\xi''\rangle$ and $\langle\xi'|\beta|\xi''\rangle$ respectively,
according to the usual mathematical rule for multiplying matrices.
This rule gives for the element in the $r$th row and $s$th column of the
product matrix the sum of the product of each element in the $r$th
row of the first factor matrix with the corresponding element in the $s$th
column of the second factor matrix. The multiplication of matrices
is non-commutative, like the multiplication of linear operators.

We can summarize our results for the case when there is only one
$\xi$ and it has discrete eigenvalues as follows:

(i) Any linear operator is represented by a matrix.

(ii) The unit operator is represented by the unit matrix.

(iii) A real linear operator is represented by a Hermitian matrix.

(iv) $\xi$ and functions of $\xi$ are represented by diagonal matrices.

(v) The matrix representing the product of two linear operators is the
product of the matrices representing the two factors.
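Rules (i) to (v) can be tried out on small arrays. The sketch below assumes a single $\xi$ with three made-up discrete eigenvalues and uses random matrices for $\alpha$ and $\beta$; the choice $f(\xi) = e^{\xi}$ and all variable names are illustrative assumptions.

```python
import numpy as np

xi_eigs = np.array([1.0, 2.0, 4.0])        # assumed eigenvalues of xi
Xi = np.diag(xi_eigs)                       # rule (iv): xi is a diagonal matrix
f_Xi = np.diag(np.exp(xi_eigs))             # rule (iv): f(xi) = exp(xi) is diagonal too
unit = np.eye(3)                            # rule (ii): the unit matrix

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
alpha = A + A.conj().T                      # rule (iii): a real operator gives a Hermitian matrix
beta = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# Rule (v), written out as in equation (44): sum over the intermediate label t
prod_by_sum = np.array([[sum(alpha[r, t] * beta[t, s] for t in range(3))
                         for s in range(3)] for r in range(3)])

# Order matters: the commutator of two generic matrices does not vanish
commutator = alpha @ beta - beta @ alpha
```

Here `prod_by_sum` agrees with NumPy's built-in product `alpha @ beta`, which is just the row-by-column rule described above.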

Let us now consider the case when there is only one $\xi$ and it has
continuous eigenvalues. The representative of $\alpha$ is now $\langle\xi'|\alpha|\xi''\rangle$, a
function of two variables $\xi'$ and $\xi''$ which can vary continuously. It
is convenient to call such a function a 'matrix', using this word in
a generalized sense, in order that we may be able to use the same
terminology for the discrete and continuous cases. One of these
generalized matrices cannot, of course, be written out as a
two-dimensional array like an ordinary matrix, since the number of its
rows and columns is an infinity equal to the number of points on a
line, and the number of its elements is an infinity equal to the
number of points in an area.

We arrange our definitions concerning these generalized matrices
so that the rules (i)-(v) which we had above for the discrete case
hold also for the continuous case. The unit operator is represented
by $\delta(\xi'-\xi'')$ and the generalized matrix formed by these elements
we define to be the unit matrix. We still have equation (41) as the
condition for $\alpha$ to be real and we define the generalized matrix formed
by the elements $\langle\xi'|\alpha|\xi''\rangle$ to be Hermitian when it satisfies this
condition. $\xi$ is represented by

$$\langle\xi'|\xi|\xi''\rangle = \xi'\,\delta(\xi'-\xi'') \tag{45}$$

and $f(\xi)$ by

$$\langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta(\xi'-\xi''), \tag{46}$$

and the generalized matrices formed by these elements we define to be
diagonal matrices. From (11), we could equally well have $\xi''$ and $f(\xi'')$
as the coefficients of $\delta(\xi'-\xi'')$ on the right-hand sides of (45) and (46)
respectively. Corresponding to equation (44) we now have, from (24),

$$\langle\xi'|\alpha\beta|\xi''\rangle = \int \langle\xi'|\alpha|\xi'''\rangle\,d\xi'''\,\langle\xi'''|\beta|\xi''\rangle, \tag{47}$$

with an integral instead of a sum, and we define the generalized
matrix formed by the elements on the right-hand side here to be the
product of the matrices formed by $\langle\xi'|\alpha|\xi''\rangle$ and $\langle\xi'|\beta|\xi''\rangle$. With
these definitions we secure complete parallelism between the discrete
and continuous cases and we have the rules (i)-(v) holding for both.

The question arises how a general diagonal matrix is to be defined
in the continuous case, as so far we have only defined the right-hand
sides of (45) and (46) to be examples of diagonal matrices. One
might be inclined to define as diagonal any matrix whose $(\xi', \xi'')$
elements all vanish except when $\xi'$ differs infinitely little from $\xi''$,
but this would not be satisfactory, because an important property
of diagonal matrices in the discrete case is that they always commute
with one another and we want this property to hold also in the
continuous case. In order that the matrix formed by the elements
$\langle\xi'|\omega|\xi''\rangle$ in the continuous case may commute with that formed by
the elements on the right-hand side of (45) we must have, using the
multiplication rule (47),

$$\int \langle\xi'|\omega|\xi'''\rangle\,d\xi'''\,\xi'''\,\delta(\xi'''-\xi'') = \int \xi'\,\delta(\xi'-\xi''')\,d\xi'''\,\langle\xi'''|\omega|\xi''\rangle.$$

With the help of formula (4), this reduces to

$$\langle\xi'|\omega|\xi''\rangle\,\xi'' = \xi'\,\langle\xi'|\omega|\xi''\rangle \tag{48}$$

or

$$(\xi'-\xi'')\,\langle\xi'|\omega|\xi''\rangle = 0.$$

This gives, according to the rule by which (13) follows from (12),

$$\langle\xi'|\omega|\xi''\rangle = c'\,\delta(\xi'-\xi''),$$

where $c'$ is a number that may depend on $\xi'$. Thus $\langle\xi'|\omega|\xi''\rangle$ is of the
form of the right-hand side of (46). For this reason we define only
matrices whose elements are of the form of the right-hand side of (46) to
be diagonal matrices. It is easily verified that these matrices all
commute with one another. One can form other matrices whose
$(\xi', \xi'')$ elements all vanish when $\xi'$ differs appreciably from $\xi''$ and
have a different form of singularity when $\xi'$ equals $\xi''$ [we shall later
introduce the derivative $\delta'(x)$ of the $\delta$ function and $\delta'(\xi'-\xi'')$ will
then be an example, see § 22 equation (19)], but these other matrices
are not diagonal according to the definition.
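A grid version of this distinction may help. In the sketch below, which is an illustration under assumptions and not taken from the text, the continuous variable is sampled at `n` points with spacing `h`, and a central-difference matrix `D` serves as a rough stand-in for a matrix whose elements vanish away from the diagonal but are not of the form (46).

```python
import numpy as np

n, h = 6, 0.5
x = h * np.arange(n)                         # sampled eigenvalues of xi
X = np.diag(x)                               # grid analogue of (45)
F = np.diag(np.cos(x))                       # grid analogue of (46) with f = cos

# Central difference: non-zero only next to the diagonal, but not diagonal
D = (np.eye(n, k=1) - np.eye(n, k=-1)) / (2 * h)

comm_XF = X @ F - F @ X                      # two diagonal matrices
comm_XD = X @ D - D @ X                      # diagonal matrix with the near-diagonal one
```

On this grid `comm_XF` vanishes, while `comm_XD` has the entries $-1/2$ on the two off-diagonals, so `D` fails the commutation property that motivates the definition of a diagonal matrix.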

Let us now pass on to the case when there is only one $\xi$ and it has
both discrete and continuous eigenvalues. Using $\xi^r, \xi^s$ to denote
discrete eigenvalues and $\xi', \xi''$ to denote continuous eigenvalues, we
now have the representative of $\alpha$ consisting of four kinds of quantities,
$\langle\xi^r|\alpha|\xi^s\rangle$, $\langle\xi^r|\alpha|\xi'\rangle$, $\langle\xi'|\alpha|\xi^r\rangle$, $\langle\xi'|\alpha|\xi''\rangle$. These quantities can all
be put together and considered to form a more general kind of matrix
having some discrete rows and columns and also a continuous range
of rows and columns. We define unit matrix, Hermitian matrix,
diagonal matrix, and the product of two matrices also for this more
general kind of matrix so as to make the rules (i)-(v) still hold. The
details are a straightforward generalization of what has gone before
and need not be given explicitly.

Let us now go back to the general case of several $\xi$'s, $\xi_1, \xi_2, \ldots, \xi_u$.
The representative of $\alpha$, expression (39), may still be looked upon as
forming a matrix, with rows corresponding to different values of
$\xi'_1, \ldots, \xi'_u$ and columns corresponding to different values of $\xi''_1, \ldots, \xi''_u$.
Unless all the $\xi$'s have discrete eigenvalues, this matrix will be of the
generalized kind with continuous ranges of rows and columns. We
again arrange our definitions so that the rules (i)-(v) hold, with rule
(iv) generalized to:

(iv') Each $\xi_m$ ($m = 1, 2, \ldots, u$) and any function of them is
represented by a diagonal matrix.

A diagonal matrix is now defined as one whose general element
is of the form

$$\langle\xi'_1\ldots\xi'_u|\omega|\xi''_1\ldots\xi''_u\rangle = c'\,\delta_{\xi'_1\xi''_1}\ldots\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1})\ldots\delta(\xi'_u-\xi''_u) \tag{49}$$

in the case when $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have
continuous eigenvalues, $c'$ being any function of the $\xi'$'s. This
definition is the generalization of what we had with one $\xi$ and makes
diagonal matrices always commute with one another. The other
definitions are straightforward and need not be given explicitly.

We now have a linear operator always represented by a matrix.
The sum of two linear operators is represented by the sum of the
matrices representing the operators and this, together with rule (v),
means that the matrices are subject to the same algebraic relations as
the linear operators. If any algebraic equation holds between certain
linear operators, the same equation must hold between the matrices
representing those operators.

The scheme of matrices can be extended to bring in the
representatives of ket and bra vectors. The matrices representing linear
operators are all square matrices with the same number of rows and
columns, and with, in fact, a one-one correspondence between their
rows and columns. We may look upon the representative of a ket
$|P\rangle$ as a matrix with a single column by setting all the numbers
$\langle\xi'_1\ldots\xi'_u|P\rangle$ which form this representative one below the other. The
number of rows in this matrix will be the same as the number of
rows or columns in the square matrices representing linear operators.
Such a single-column matrix can be multiplied on the left by a square
matrix representing a linear operator, by a rule
similar to that for the multiplication of two square matrices. The
product is another single-column matrix with elements given by

$$\sum_{\xi''_1..\xi''_v}\int\cdots\int \langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle\,d\xi''_{v+1}\ldots d\xi''_u\,\langle\xi''_1\ldots\xi''_u|P\rangle.$$

From (35) this is just equal to $\langle\xi'_1\ldots\xi'_u|\alpha|P\rangle$, the representative of
$\alpha|P\rangle$. Similarly we may look upon the representative of a bra $\langle Q|$
as a matrix with a single row by setting all the numbers $\langle Q|\xi'_1\ldots\xi'_u\rangle$
side by side. Such a single-row matrix may be multiplied on the
right by a square matrix $\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle$, the product being another
single-row matrix, which is just the representative of $\langle Q|\alpha$. The
single-row matrix representing $\langle Q|$ may be multiplied on the right
by the single-column matrix representing $|P\rangle$, the product being a
matrix with just a single element, which is equal to $\langle Q|P\rangle$. Finally,
the single-row matrix representing $\langle Q|$ may be multiplied on the left
by the single-column matrix representing $|P\rangle$, the product being a
square matrix, which is just the representative of $|P\rangle\langle Q|$. In this
way all our abstract symbols, linear operators, bra vectors, and ket
vectors, can be represented by matrices, which are subject to the
same algebraic relations as the abstract symbols themselves.
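The four products described above map directly onto array shapes. The following sketch assumes a three-dimensional discrete case with made-up numbers; `P`, `Q_ket`, `Q` and `alpha` are illustrative names.

```python
import numpy as np

P = np.array([[1.0 + 1.0j], [2.0], [0.5j]])      # ket |P> as a single-column matrix
Q_ket = np.array([[1.0], [1.0j], [-1.0]])        # the ket |Q>
Q = Q_ket.conj().T                                # bra <Q| as a single-row matrix
alpha = np.array([[2, 1j, 0],
                  [-1j, 3, 1],
                  [0, 1, 1]], dtype=complex)      # a linear operator as a square matrix

alpha_P = alpha @ P        # single column: representative of alpha|P>
Q_alpha = Q @ alpha        # single row: representative of <Q|alpha
number = Q @ P             # single element: <Q|P>
outer = P @ Q              # square matrix: representative of |P><Q|
```

The shapes `(3, 1)`, `(1, 3)`, `(1, 1)` and `(3, 3)` of the four results correspond to a ket, a bra, a number and a linear operator respectively.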


18. Probability amplitudes

Representations are of great importance in the physical interpretation
of quantum mechanics as they provide a convenient method for
obtaining the probabilities of observables having given values. In
§ 12 we obtained the probability of an observable having any specified
value for a given state and in § 13 we generalized this result
and obtained the probability of a set of commuting observables
simultaneously having specified values for a given state. Let us now
apply this result to a complete set of commuting observables, say the
set of $\xi$'s which we have been dealing with already. According to
formula (51) of § 13, the probability of each $\xi_r$ having the value $\xi'_r$
for the state corresponding to the normalized ket vector $|x\rangle$ is

$$P_{\xi'_1\ldots\xi'_u} = \langle x|\delta_{\xi_1\xi'_1}\,\delta_{\xi_2\xi'_2}\ldots\delta_{\xi_u\xi'_u}|x\rangle. \tag{50}$$

If the $\xi$'s all have discrete eigenvalues, we can use (35) with $v = u$
and no integrals, and get

$$\begin{aligned}
P_{\xi'_1\ldots\xi'_u} &= \sum_{\xi''_1..\xi''_u} \langle x|\delta_{\xi_1\xi'_1}\ldots\delta_{\xi_u\xi'_u}|\xi''_1\ldots\xi''_u\rangle\langle\xi''_1\ldots\xi''_u|x\rangle \\
&= \langle x|\xi'_1\ldots\xi'_u\rangle\langle\xi'_1\ldots\xi'_u|x\rangle = |\langle\xi'_1\ldots\xi'_u|x\rangle|^2.
\end{aligned} \tag{51}$$

We thus get the simple result that the probability of the $\xi$'s having the
values $\xi'$ is just the square of the modulus of the appropriate coordinate
of the normalized ket vector corresponding to the state concerned.

If the $\xi$'s do not all have discrete eigenvalues, but if, say, $\xi_1, \ldots, \xi_v$
have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have continuous eigenvalues,
then to get something physically significant we must obtain the
probability of each $\xi_r$ ($r = 1, \ldots, v$) having a specified value $\xi'_r$ and each
$\xi_s$ ($s = v+1, \ldots, u$) lying in a specified small range $\xi'_s$ to $\xi'_s + d\xi'_s$. For
this purpose we must replace each factor $\delta_{\xi_s\xi'_s}$ in (50) by a factor $\chi_s$,
which is that function of the observable $\xi_s$ which is equal to unity
for $\xi_s$ within the range $\xi'_s$ to $\xi'_s + d\xi'_s$ and zero otherwise. Proceeding
as before with the help of (35), we obtain for this probability

$$P_{\xi'_1\ldots\xi'_u}\,d\xi'_{v+1}\ldots d\xi'_u = |\langle\xi'_1\ldots\xi'_u|x\rangle|^2\,d\xi'_{v+1}\ldots d\xi'_u. \tag{52}$$

Thus in every case the probability distribution of values for the $\xi$'s is
given by the square of the modulus of the representative of the
normalized ket vector corresponding to the state concerned.

The numbers which form the representative of a normalized ket
(or bra) may for this reason be called probability amplitudes. The
square of the modulus of a probability amplitude is an ordinary
probability, or a probability per unit range for those variables that
have continuous ranges of values.
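For the discrete case the step from (50) to (51) can be followed numerically. This sketch assumes a single $\xi$ with three eigenvalues and an arbitrary made-up ket; the operator $\delta_{\xi\xi'}$ is represented by a diagonal matrix with a single unit entry.

```python
import numpy as np

x = np.array([1.0 + 1.0j, 2.0, -1.0j])     # representative of an unnormalized ket
x = x / np.linalg.norm(x)                   # normalize, so that <x|x> = 1

# Equation (51): squares of the moduli of the coordinates
probs_51 = np.abs(x) ** 2

# Equation (50): <x| delta_{xi xi'} |x>, one diagonal projector per eigenvalue
probs_50 = np.array([np.vdot(x, np.diag(np.eye(3)[k]) @ x).real
                     for k in range(3)])
```

Both arrays equal `[2/7, 4/7, 1/7]` for this ket, and the entries sum to unity as probabilities must.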

We may be interested in a state whose corresponding ket $|x\rangle$ cannot
be normalized. This occurs, for example, if the state is an eigenstate
of some observable belonging to an eigenvalue lying in a range of
eigenvalues. The formula (51) or (52) can then still be used to give
the relative probability of the $\xi$'s having specified values or having
values lying in specified small ranges, i.e. it will give correctly the
ratios of the probabilities for different $\xi'$'s. The numbers $\langle\xi'_1\ldots\xi'_u|x\rangle$
may then be called relative probability amplitudes.

The representation for which the above results hold is characterized
by the basic vectors being simultaneous eigenvectors of all the $\xi$'s.
It may also be characterized by the requirement that each of the $\xi$'s
shall be represented by a diagonal matrix, this condition being easily
seen to be equivalent to the previous one. The latter characterization
is usually the more convenient one. For brevity, we shall formulate
it as each of the $\xi$'s 'being diagonal in the representation'.

Provided the $\xi$'s form a complete set of commuting observables,
the representation is completely determined by the characterization,
apart from arbitrary phase factors in the basic vectors. Each basic bra
$\langle\xi'_1\ldots\xi'_u|$ may be multiplied by $e^{i\gamma'}$, where $\gamma'$ is any real function of
the variables $\xi'_1, \ldots, \xi'_u$, without changing any of the conditions which
the representation has to satisfy, i.e. the condition that the $\xi$'s are
diagonal or that the basic vectors are simultaneous eigenvectors of
the $\xi$'s, and the fundamental properties of the basic vectors (34) and
(35). With the basic bras changed in this way, the representative
$\langle\xi'_1\ldots\xi'_u|P\rangle$ of a ket $|P\rangle$ gets multiplied by $e^{i\gamma'}$, the representative
$\langle Q|\xi'_1\ldots\xi'_u\rangle$ of a bra $\langle Q|$ gets multiplied by $e^{-i\gamma'}$ and the
representative $\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle$ of a linear operator $\alpha$ gets multiplied by
$e^{i(\gamma'-\gamma'')}$.
The probabilities or relative probabilities (51), (52) are, of course,
unaltered.
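The effect of the arbitrary phase factors can be illustrated in a discrete case. The sketch below is an assumption-laden example with random data: `gamma` holds one real phase per basic vector, and the representatives are rescaled exactly as described above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
P = rng.normal(size=n) + 1j * rng.normal(size=n)                 # <xi'|P>
alpha = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))   # <xi'|alpha|xi''>

gamma = rng.uniform(0.0, 2.0 * np.pi, size=n)                    # real phases gamma'
phase = np.exp(1j * gamma)

P_new = phase * P                                                # times e^{i gamma'}
alpha_new = phase[:, None] * alpha * phase.conj()[None, :]       # times e^{i(gamma' - gamma'')}

probs_old = np.abs(P) ** 2
probs_new = np.abs(P_new) ** 2
```

The squared moduli agree before and after the change, and `alpha_new @ P_new` equals `phase * (alpha @ P)`, so the representative of $\alpha|P\rangle$ picks up the same factor $e^{i\gamma'}$ as any other ket.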

The probabilities that one calculates in practical problems in
quantum mechanics are nearly always obtained from the squares
of the moduli of probability amplitudes or relative probability
amplitudes. Even when one is interested only in the probability of an
incomplete set of commuting observables having specified values, it
is usually necessary first to make the set a complete one by the
introduction of some extra commuting observables and to obtain
the probability of the complete set having specified values (as the
square of the modulus of a probability amplitude), and then to sum
or integrate over all possible values of the extra observables. A
more direct application of formula (51) of § 13 is usually not
practicable.

To introduce a representation in practice

(i) We look for observables which we would like to have diagonal,
either because we are interested in their probabilities or for
reasons of mathematical simplicity;

(ii) We must see that they all commute, a necessary condition
since diagonal matrices always commute;

(iii) We then see that they form a complete commuting set, and
if not we add some more commuting observables to them to
make them into a complete commuting set;

(iv) We set up an orthogonal representation with this complete
commuting set diagonal.

The representation is then completely determined except for the
arbitrary phase factors. For most purposes the arbitrary phase
factors are unimportant and trivial, so that we may count the
representation as being completely determined by the observables
that are diagonal in it. This fact is already implied in our notation,
since the only indication in a representative of the representation to
which it belongs are the letters denoting the observables that are
diagonal.

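It may be that we are interested in two representations for the
same dynamical system. Suppose that in one of them the complete
set of commuting observables $\xi_1, \ldots, \xi_u$ are diagonal and the basic
bras are $\langle\xi'_1\ldots\xi'_u|$ and in the other the complete set of commuting
observables $\eta_1, \ldots, \eta_w$ are diagonal and the basic bras are $\langle\eta'_1\ldots\eta'_w|$.
A ket $|P\rangle$ will now have the two representatives $\langle\xi'_1\ldots\xi'_u|P\rangle$ and
$\langle\eta'_1\ldots\eta'_w|P\rangle$. If $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have
continuous eigenvalues and if $\eta_1, \ldots, \eta_x$ have discrete eigenvalues and
$\eta_{x+1}, \ldots, \eta_w$ have continuous eigenvalues, we get from (35)

$$\langle\eta'_1\ldots\eta'_w|P\rangle = \sum_{\xi'_1..\xi'_v}\int\cdots\int \langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle\,d\xi'_{v+1}\ldots d\xi'_u\,\langle\xi'_1\ldots\xi'_u|P\rangle, \tag{53}$$

and interchanging $\xi$'s and $\eta$'s

$$\langle\xi'_1\ldots\xi'_u|P\rangle = \sum_{\eta'_1..\eta'_x}\int\cdots\int \langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle\,d\eta'_{x+1}\ldots d\eta'_w\,\langle\eta'_1\ldots\eta'_w|P\rangle. \tag{54}$$

These are the transformation equations which give one representative
of $|P\rangle$ in terms of the other. They show that either representative
is expressible linearly in terms of the other, with the quantities

$$\langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle, \qquad \langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle \tag{55}$$

as coefficients. These quantities are called the transformation
functions. Similar equations may be written down to connect the two
representatives of a bra vector or of a linear operator. The
transformation functions (55) are in every case the means which enable
one to pass from one representative to the other. Each of the
transformation functions is the conjugate complex of the other, and
they satisfy the conditions

$$\sum_{\xi'_1..\xi'_v}\int\cdots\int \langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle\,d\xi'_{v+1}\ldots d\xi'_u\,\langle\xi'_1\ldots\xi'_u|\eta''_1\ldots\eta''_w\rangle
= \delta_{\eta'_1\eta''_1}\ldots\delta_{\eta'_x\eta''_x}\,\delta(\eta'_{x+1}-\eta''_{x+1})\ldots\delta(\eta'_w-\eta''_w) \tag{56}$$

and the corresponding conditions with $\xi$'s and $\eta$'s interchanged, as
may be verified from (35) and (34) and the corresponding equations
for the $\eta$'s.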

Transformation functions are examples of probability amplitudes
or relative probability amplitudes. Let us take the case when all the
$\xi$'s and all the $\eta$'s have discrete eigenvalues. Then the basic ket
$|\eta'_1\ldots\eta'_w\rangle$ is normalized, so that its representative in the
$\xi$-representation, $\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle$, is a probability amplitude for each set of values
for the $\xi'$'s. The state to which these probability amplitudes refer,
namely the state corresponding to $|\eta'_1\ldots\eta'_w\rangle$, is characterized by the
condition that a simultaneous measurement of $\eta_1, \ldots, \eta_w$ is certain to
lead to the results $\eta'_1, \ldots, \eta'_w$. Thus $|\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle|^2$ is the
probability of the $\xi$'s having the values $\xi'_1\ldots\xi'_u$ for the state for which the
$\eta$'s certainly have the values $\eta'_1\ldots\eta'_w$. Since

$$|\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle|^2 = |\langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle|^2,$$

we have the theorem of reciprocity: the probability of the $\xi$'s having
the values $\xi'$ for the state for which the $\eta$'s certainly have the values $\eta'$
is equal to the probability of the $\eta$'s having the values $\eta'$ for the state for
which the $\xi$'s certainly have the values $\xi'$.

If all the $\eta$'s have discrete eigenvalues and some of the $\xi$'s have
continuous eigenvalues, $|\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle|^2$ still gives the probability
distribution of values for the $\xi$'s for the state for which the $\eta$'s
certainly have the values $\eta'$. If some of the $\eta$'s have continuous
eigenvalues, $|\eta'_1\ldots\eta'_w\rangle$ is not normalized and $|\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle|^2$ then gives
only the relative probability distribution of values for the $\xi$'s for the
state for which the $\eta$'s certainly have the values $\eta'$.
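When all eigenvalues are discrete, the transformation functions form a unitary matrix, and (53), (54), (56) and the theorem of reciprocity become statements about that matrix. The sketch below assumes a three-dimensional case with a random unitary `U`, where `U[j, k]` plays the role of $\langle\eta^j|\xi^k\rangle$; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
U, _ = np.linalg.qr(M)                    # a unitary matrix; U[j, k] ~ <eta_j|xi_k>

P_xi = rng.normal(size=n) + 1j * rng.normal(size=n)   # <xi'|P>
P_eta = U @ P_xi                          # equation (53): <eta'|P>
P_xi_back = U.conj().T @ P_eta            # equation (54): back to <xi'|P>

orthonormality = U @ U.conj().T           # equation (56) in the discrete case

# prob_xi_given_eta[k, j]: probability of xi_k when eta certainly has the value eta_j
prob_xi_given_eta = np.abs(U.conj().T) ** 2
# prob_eta_given_xi[j, k]: probability of eta_j when xi certainly has the value xi_k
prob_eta_given_xi = np.abs(U) ** 2
```

The two probability tables are transposes of one another, which is the theorem of reciprocity in this finite setting, and each row and column sums to unity.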


19. Theorems about functions of observables 

We shall illustrate the mathematical value of representations by
using them to prove some theorems.

Theorem 1. A linear operator that commutes with an observable $\xi$
commutes also with any function of $\xi$.

The theorem is obviously true when the function is expressible as
a power series. To prove it generally, let $\omega$ be the linear operator,
so that we have the equation

$$\xi\omega - \omega\xi = 0. \tag{57}$$

Let us introduce a representation in which $\xi$ is diagonal. If $\xi$ by
itself does not form a complete commuting set of observables, we must
make it into a complete commuting set by adding certain observables,
$\beta$ say, to it, and then take the representation in which $\xi$ and the $\beta$'s
are diagonal. (The case when $\xi$ does form a complete commuting set
by itself can be looked upon as a special case of the preceding one
with the number of $\beta$ variables zero.) In this representation equation
(57) becomes

$$\langle\xi'\beta'|\xi\omega - \omega\xi|\xi''\beta''\rangle = 0,$$

which reduces to

$$\xi'\,\langle\xi'\beta'|\omega|\xi''\beta''\rangle - \langle\xi'\beta'|\omega|\xi''\beta''\rangle\,\xi'' = 0.$$

In the case when the eigenvalues of $\xi$ are discrete, this equation
shows that all the matrix elements $\langle\xi'\beta'|\omega|\xi''\beta''\rangle$ of $\omega$ vanish except
those for which $\xi' = \xi''$. In the case when the eigenvalues of $\xi$ are
continuous it shows, like equation (48), that $\langle\xi'\beta'|\omega|\xi''\beta''\rangle$ is of the
form

$$\langle\xi'\beta'|\omega|\xi''\beta''\rangle = c\,\delta(\xi'-\xi''),$$

where $c$ is some function of $\xi'$ and the $\beta'$'s and $\beta''$'s. In either case
we may say that the matrix representing $\omega$ 'is diagonal with respect
to $\xi$'. If $f(\xi)$ denotes any function of $\xi$ in accordance with the general
theory of § 11, which requires $f(\xi''')$ to be defined for $\xi'''$ any eigenvalue
of $\xi$, we can deduce in either case

$$f(\xi')\,\langle\xi'\beta'|\omega|\xi''\beta''\rangle - \langle\xi'\beta'|\omega|\xi''\beta''\rangle\,f(\xi'') = 0.$$

This gives

$$\langle\xi'\beta'|f(\xi)\,\omega - \omega\,f(\xi)|\xi''\beta''\rangle = 0,$$

so that

$$f(\xi)\,\omega - \omega\,f(\xi) = 0$$

and the theorem is proved.
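A small numerical instance of Theorem 1 is sketched below. It assumes an observable $\xi$ with a doubly degenerate eigenvalue, so that an operator commuting with it need not be diagonal, and takes $f(\xi) = \sqrt{\xi} + 1/\xi$ as an arbitrary function defined on the eigenvalues; the matrices are made up for illustration.

```python
import numpy as np

xi_vals = np.array([1.0, 1.0, 2.0])          # eigenvalue 1 is doubly degenerate
Xi = np.diag(xi_vals)

# omega mixes only the two basic vectors that share the eigenvalue 1,
# i.e. it is "diagonal with respect to xi" in the sense of the proof
omega = np.array([[1, 2j, 0],
                  [3, -1, 0],
                  [0, 0, 5]], dtype=complex)

# f(xi) is the diagonal matrix with entries f(xi') on the diagonal
f_Xi = np.diag(np.sqrt(xi_vals) + 1.0 / xi_vals)

comm_xi = Xi @ omega - omega @ Xi
comm_f = f_Xi @ omega - omega @ f_Xi
```

Both commutators vanish: `omega` has non-zero elements only between basic vectors with equal $\xi'$, and $f(\xi')$ takes equal values there.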

As a special case of the theorem, we have the result that any
observable that commutes with an observable $\xi$ also commutes with
any function of $\xi$. This result appears as a physical necessity when
we identify, as in § 13, the condition of commutability of two
observables with the condition of compatibility of the corresponding
observations. Any observation that is compatible with the
measurement of an observable $\xi$ must also be compatible with the
measurement of $f(\xi)$, since any measurement of $\xi$ includes in itself
a measurement of $f(\xi)$.

Theorem 2. A linear operator that commutes with each of a complete
set of commuting observables is a function of those observables.

Let $\omega$ be the linear operator and $\xi_1, \xi_2, \ldots, \xi_u$ the complete set of
commuting observables, and set up a representation with these
observables diagonal. Since $\omega$ commutes with each of the $\xi$'s, the
matrix representing it is diagonal with respect to each of the $\xi$'s,
by the argument we had above. This matrix is therefore a diagonal
matrix and is of the form (49), involving a number $c'$ which is a
function of the $\xi'$'s. It thus represents the function of the $\xi$'s that
$c'$ is of the $\xi'$'s, and hence $\omega$ equals this function of the $\xi$'s.
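Theorem 2 can be illustrated in the simplest setting, where a single $\xi$ with distinct eigenvalues is already a complete commuting set. In the sketch below the eigenvalues and the diagonal entries `c` are assumptions, and polynomial interpolation is used only as one convenient way of exhibiting a function with $f(\xi') = c'$.

```python
import numpy as np

xi_vals = np.array([1.0, 2.0, 3.0])          # distinct eigenvalues: xi alone is complete
Xi = np.diag(xi_vals)

# A matrix with an off-diagonal element does not commute with Xi
off = np.eye(3, k=1)
comm_off = Xi @ off - off @ Xi               # elements (xi_i - xi_j) * off[i, j]

# A diagonal matrix of the form (49) does commute with Xi
c = np.array([0.5, -1.0, 4.0])               # the numbers c' as a function of xi'
omega = np.diag(c)
comm_omega = Xi @ omega - omega @ Xi

# Exhibit omega as a function of xi: the quadratic through the points (xi', c')
coeffs = np.polyfit(xi_vals, c, deg=2)       # highest power first
omega_as_function = coeffs[0] * Xi @ Xi + coeffs[1] * Xi + coeffs[2] * np.eye(3)
```

The interpolating polynomial evaluated on `Xi` reproduces `omega`, so `omega` is indeed a function of $\xi$; any other function taking the values `c` on the eigenvalues would represent the same operator.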

Theorem 3. If an observable $\xi$ and a linear operator $g$ are such that
any linear operator that commutes with $\xi$ also commutes with $g$, then $g$
is a function of $\xi$.

This is the converse of Theorem 1. To prove it, we use the same
representation with $\xi$ diagonal as we had for Theorem 1. In the first
place, we see that $g$ must commute with $\xi$ itself, and hence the
representative of $g$ must be diagonal with respect to $\xi$, i.e. it must
be of the form

$$\langle\xi'\beta'|g|\xi''\beta''\rangle = a(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad\text{or}\quad a(\xi'\beta'\beta'')\,\delta(\xi'-\xi''),$$

according to whether $\xi$ has discrete or continuous eigenvalues. Now
let $\omega$ be any linear operator that commutes with $\xi$, so that its
representative is of the form

$$\langle\xi'\beta'|\omega|\xi''\beta''\rangle = b(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad\text{or}\quad b(\xi'\beta'\beta'')\,\delta(\xi'-\xi'').$$

By hypothesis $\omega$ must also commute with $g$, so that

$$\langle\xi'\beta'|g\omega - \omega g|\xi''\beta''\rangle = 0. \tag{58}$$

If we suppose for definiteness that the $\beta$'s have discrete eigenvalues,
(58) leads, with the help of the law of matrix multiplication, to

$$\sum_{\beta'''} \{a(\xi'\beta'\beta''')\,b(\xi'\beta'''\beta'') - b(\xi'\beta'\beta''')\,a(\xi'\beta'''\beta'')\} = 0, \tag{59}$$

the left-hand side of (58) being equal to the left-hand side of (59)
multiplied by $\delta_{\xi'\xi''}$ or $\delta(\xi'-\xi'')$. Equation (59) must hold for all
functions $b(\xi'\beta'\beta'')$. We can deduce that

$$a(\xi'\beta'\beta'') = 0 \quad\text{for } \beta' \ne \beta'',$$

$$a(\xi'\beta'\beta') = a(\xi'\beta''\beta'').$$

The first of these results shows that the matrix representing $g$ is
diagonal and the second shows that $a(\xi'\beta'\beta')$ is a function of $\xi'$ only.
We can now infer that $g$ is that function of $\xi$ which $a(\xi'\beta'\beta')$ is of $\xi'$,
so the theorem is proved. The proof is analogous if some of the $\beta$'s
have continuous eigenvalues.

Theorems 1 and 3 are still valid if we replace the observable $\xi$ by
any set of commuting observables $\xi_1, \xi_2, \ldots, \xi_r$, only formal changes
being needed in the proofs.

20. Developments in notation 

The theory of representations that we have developed provides a
general system for labelling kets and bras. In a representation in which
the complete set of commuting observables $\xi_1, \ldots, \xi_u$ are diagonal any
ket $|P\rangle$ will have a representative $\langle\xi'_1\ldots\xi'_u|P\rangle$, or $\langle\xi'|P\rangle$ for brevity.
This representative is a definite function of the variables $\xi'$, say $\psi(\xi')$.
The function $\psi$ then determines the ket $|P\rangle$ completely, so it may be
used to label this ket, to replace the arbitrary label $P$. In symbols,

$$\text{if}\quad \langle\xi'|P\rangle = \psi(\xi') \quad\text{we put}\quad |P\rangle = |\psi(\xi)\rangle. \tag{60}$$

We must put $|P\rangle$ equal to $|\psi(\xi)\rangle$ and not $|\psi(\xi')\rangle$, since it does not
depend on a particular set of eigenvalues for the $\xi$'s, but only on the
form of the function $\psi$.

With $f(\xi)$ any function of the observables $\xi_1, \ldots, \xi_u$, $f(\xi)|P\rangle$ will
have as its representative

$$\langle\xi'|f(\xi)|P\rangle = f(\xi')\,\psi(\xi').$$

Thus according to (60) we put

$$f(\xi)|P\rangle = |f(\xi)\psi(\xi)\rangle.$$

With the help of the second of equations (60) we now get

$$f(\xi)|\psi(\xi)\rangle = |f(\xi)\psi(\xi)\rangle. \tag{61}$$

This is a general result holding for any functions $f$ and $\psi$ of the $\xi$'s,
and it shows that the vertical line | is not necessary with the new
notation for a ket; either side of (61) may be written simply as
$f(\xi)\psi(\xi)\rangle$. Thus the rule for the new notation becomes:

$$\text{if}\quad \langle\xi'|P\rangle = \psi(\xi') \quad\text{we put}\quad |P\rangle = \psi(\xi)\rangle. \tag{62}$$

We may further shorten $\psi(\xi)\rangle$ to $\psi\rangle$, leaving the variables $\xi$
understood, if no ambiguity arises thereby.

The ket $\psi(\xi)\rangle$ may be considered as the product of the linear
operator $\psi(\xi)$ with a ket which is denoted simply by $\rangle$ without a
label. We call the ket $\rangle$ the standard ket. Any ket whatever can be
expressed as a function of the $\xi$'s multiplied into the standard ket.
For example, taking $|P\rangle$ in (62) to be the basic ket $|\xi''\rangle$, we find

$$|\xi''\rangle = \delta_{\xi_1\xi''_1}\ldots\delta_{\xi_v\xi''_v}\,\delta(\xi_{v+1}-\xi''_{v+1})\ldots\delta(\xi_u-\xi''_u)\rangle \tag{63}$$

in the case when $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have
continuous eigenvalues. The standard ket is characterized by the
condition that its representative $\langle\xi'|\rangle$ is unity over the whole domain
of the variable $\xi'$, as may be seen by putting $\psi = 1$ in (62).

A further contraction may be made in the notation, namely to
leave the symbol $\rangle$ for the standard ket understood. A ket is then
written simply as $\psi(\xi)$, a function of the observables $\xi$. A function
of the $\xi$'s used in this way to denote a ket is called a wave function.†
The system of notation provided by wave functions is the one usually
used by most authors for calculations in quantum mechanics. In
using it one should remember that each wave function is understood
to have the standard ket multiplied into it on the right, which
prevents one from multiplying the wave function by any operator
on the right. Wave functions can be multiplied by operators only on
the left. This distinguishes them from ordinary functions of the $\xi$'s,
which are operators and can be multiplied by operators on either the
left or the right. A wave function is just the representative of a ket
expressed as a function of the observables $\xi$, instead of eigenvalues $\xi'$
for those observables. The square of its modulus gives the
probability (or the relative probability, if it is not normalized) of the $\xi$'s
having specified values, or lying in specified small ranges, for the
corresponding state.

The new notation for bras may be developed in the same way as
for kets. A bra $\langle Q|$ whose representative $\langle Q|\xi'\rangle$ is $\phi(\xi')$ we write
$\langle\phi(\xi)|$. With this notation the conjugate imaginary to $|\psi(\xi)\rangle$ is
$\langle\bar\psi(\xi)|$. Thus the rule that we have used hitherto, that a ket and
its conjugate imaginary bra are both specified by the same label,
must be extended to read: if the labels of a ket involve complex
numbers or complex functions, the labels of the conjugate imaginary
bra involve the conjugate complex numbers or functions. As in the
case of kets we can show that $\langle\phi(\xi)|f(\xi)$ and $\langle\phi(\xi)f(\xi)|$ are the same,
so that the vertical line can be omitted. We can consider $\langle\phi(\xi)$ as
the product of the linear operator $\phi(\xi)$ into the standard bra $\langle$, which

† The reason for this name is that in the early days of quantum mechanics all the
examples of these functions were of the form of waves. The name is not a descriptive
one from the point of view of the modern general theory.

is the conjugate imaginary of the standard ket $\rangle$. We may leave
the standard bra understood, so that a general bra is written as $\phi(\xi)$,
the conjugate complex of a wave function. The conjugate complex
of a wave function can be multiplied by any linear operator on the
right, but cannot be multiplied by a linear operator on the left. We
can construct triple products of the form $\langle f(\xi)\rangle$. Such a triple product
is a number, equal to $f(\xi)$ summed or integrated over the whole
domain of eigenvalues for the $\xi$'s,

$$\langle f(\xi)\rangle = \sum_{\xi'_1 \ldots \xi'_v} \int \cdots \int f(\xi')\, d\xi'_{v+1} \ldots d\xi'_u \tag{64}$$

in the case when $\xi_1, \ldots, \xi_v$ have discrete eigenvalues and
$\xi_{v+1}, \ldots, \xi_u$ have continuous eigenvalues.

The standard ket and bra are defined with respect to a representation.
If we carried through the above work with a different representation
in which the complete set of commuting observables $\eta$ are
diagonal, or if we merely changed the phase factors in the representation
with the $\xi$'s diagonal, we should get a different standard ket and
bra. In a piece of work in which more than one standard ket or bra
appears one must, of course, distinguish them by giving them labels.

A further development of the notation which is of great importance
for dealing with complicated dynamical systems will now be discussed.
Suppose we have a dynamical system describable in terms of dynamical
variables which can all be divided into two sets, set A and set B
say, such that any member of set A commutes with any member of
set B. A general dynamical variable must be expressible as a function
of the A-variables and B-variables together. We may consider
another dynamical system in which the dynamical variables are the
A-variables only; let us call it the A-system. Similarly we may
consider a third dynamical system in which the dynamical variables
are the B-variables only, the B-system. The original system can
then be looked upon as a combination of the A-system and the
B-system in accordance with the mathematical scheme given below.

Let us take any ket $|a\rangle$ for the A-system and any ket $|b\rangle$ for the
B-system. We assume that they have a product $|a\rangle|b\rangle$ for which
the commutative and distributive axioms of multiplication hold, i.e.

$$|a\rangle|b\rangle = |b\rangle|a\rangle,$$

$$\{c_1|a_1\rangle + c_2|a_2\rangle\}|b\rangle = c_1|a_1\rangle|b\rangle + c_2|a_2\rangle|b\rangle,$$

$$|a\rangle\{c_1|b_1\rangle + c_2|b_2\rangle\} = c_1|a\rangle|b_1\rangle + c_2|a\rangle|b_2\rangle,$$


the $c$'s being numbers. We can give a meaning to any A-variable
operating on the product $|a\rangle|b\rangle$ by assuming that it operates only
on the $|a\rangle$ factor and commutes with the $|b\rangle$ factor, and similarly
we can give a meaning to any B-variable operating on this product
by assuming that it operates only on the $|b\rangle$ factor and commutes
with the $|a\rangle$ factor. (This makes every A-variable commute with
every B-variable.) Thus any dynamical variable of the original
system can operate on the product $|a\rangle|b\rangle$, so this product can be
looked upon as a ket for the original system, and may then be
written $|ab\rangle$, the two labels a and b being sufficient to specify it.
In this way we get the fundamental equations

$$|a\rangle|b\rangle = |b\rangle|a\rangle = |ab\rangle. \tag{65}$$

The multiplication here is of quite a different kind from any that
occurs earlier in the theory. The ket vectors $|a\rangle$ and $|b\rangle$ are in two
different vector spaces and their product is in a third vector space,
which may be called the product of the two previous vector spaces.
The number of dimensions of the product space is equal to the
product of the number of dimensions of each of the factor spaces.
A general ket vector of the product space is not of the form (65), but
is a sum or integral of kets of this form.

Let us take a representation for the A-system in which a complete
set of commuting observables $\xi_A$ of the A-system are diagonal. We
shall then have the basic bras $\langle\xi'_A|$ for the A-system. Similarly, taking
a representation for the B-system with the observables $\xi_B$ diagonal,
we shall have the basic bras $\langle\xi'_B|$ for the B-system. The products

$$\langle\xi'_A|\langle\xi'_B| = \langle\xi'_A \xi'_B| \tag{66}$$

will then provide the basic bras for a representation for the original
system, in which representation the $\xi_A$'s and the $\xi_B$'s will be diagonal.
The $\xi_A$'s and $\xi_B$'s will together form a complete set of commuting
observables for the original system. From (65) and (66) we get

$$\langle\xi'_A|a\rangle\langle\xi'_B|b\rangle = \langle\xi'_A \xi'_B|ab\rangle, \tag{67}$$

showing that the representative of $|ab\rangle$ equals the product of the
representatives of $|a\rangle$ and of $|b\rangle$ in their respective representations.
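The product rule (67) and the statement about the number of dimensions can be tried out on small finite-dimensional examples. The following is only an illustrative sketch: the kets are stored as lists of representatives, the component values and the names `product_ket`, `add` and `scale` are hypothetical, and the flat ordering of the pairs of labels is a convention chosen here, not part of the theory.

```python
def product_ket(a, b):
    """Representative of |ab>: each component of a times each component of b."""
    return [x * y for x in a for y in b]

def add(u, v):
    """Component-wise sum of two kets of the same dimension."""
    return [x + y for x, y in zip(u, v)]

def scale(c, u):
    """Number c multiplied into the ket u."""
    return [c * x for x in u]

a1 = [1, 2j]      # hypothetical ket of a 2-dimensional A-system
a2 = [3, -1]      # a second ket of the A-system
b = [1, 0, 2]     # hypothetical ket of a 3-dimensional B-system

ab = product_ket(a1, b)
dim_product = len(ab)          # product of the two dimensions, 2 * 3

# Distributive axiom: {c1|a1> + c2|a2>}|b> = c1|a1>|b> + c2|a2>|b>
c1, c2 = 2, 5
lhs = product_ket(add(scale(c1, a1), scale(c2, a2)), b)
rhs = add(scale(c1, product_ket(a1, b)), scale(c2, product_ket(a2, b)))
distributive_ok = lhs == rhs
```

Each entry of `ab` is the product of one representative of the A-ket and one of the B-ket, which is the content of (67) in this finite setting.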

We can introduce the standard ket, $\rangle_A$ say, for the A-system,
with respect to the representation with the $\xi_A$'s diagonal, and also
the standard ket $\rangle_B$ for the B-system, with respect to the
representation with the $\xi_B$'s diagonal. Their product $\rangle_A\rangle_B$ is then the
standard ket for the original system, with respect to the representation
with the $\xi_A$'s and $\xi_B$'s diagonal. Any ket for the original system
may be expressed as

$$\psi(\xi_A \xi_B)\rangle_A\rangle_B. \tag{68}$$


It may be that in a certain calculation we wish to use a particular
representation for the B-system, say the above representation with
the $\xi_B$'s diagonal, but do not wish to introduce any particular
representation for the A-system. It would then be convenient to
use the standard ket $\rangle_B$ for the B-system and no standard ket for
the A-system. Under these circumstances we could write any ket
for the original system as

$$|\xi_B\rangle\rangle_B, \tag{69}$$

in which $|\xi_B\rangle$ is a ket for the A-system and is also a function of the
$\xi_B$'s, i.e. it is a ket for the A-system for each set of values for the
$\xi_B$'s; in fact (69) equals (68) if we take

$$|\xi_B\rangle = \psi(\xi_A \xi_B)\rangle_A.$$

We may leave the standard ket $\rangle_B$ in (69) understood, and then we
have the general ket for the original system appearing as $|\xi_B\rangle$, a ket
for the A-system and a wave function in the variables $\xi_B$ of the
B-system. Examples of this notation will be used in §§ 66 and 79.

The above work can be immediately extended to a dynamical
system describable in terms of dynamical variables which can be
divided into three or more sets A, B, C, ... such that any member of
one set commutes with any member of another. Equation (65) gets
generalized to

$$|a\rangle|b\rangle|c\rangle \ldots = |abc\ldots\rangle,$$

the factors on the left being kets for the component systems and
the ket on the right being a ket for the original system. Equations
(66), (67), and (68) get generalized to many factors in a similar way.



IV 

THE QUANTUM CONDITIONS 

21. Poisson brackets 

Our work so far has consisted in setting up a general mathematical 
scheme connecting states and observables in quantum mechanics. 
One of the dominant features of this scheme is that observables, and
dynamical variables in general, appear in it as quantities which do
not obey the commutative law of multiplication. It now becomes
necessary for us to obtain equations to replace the commutative law
of multiplication, equations that will tell us the value of $\xi\eta - \eta\xi$ when
$\xi$ and $\eta$ are any two observables or dynamical variables. Only when
such equations are known shall we have a complete scheme of 
mechanics with which to replace classical mechanics. These new 
equations are called quantum conditions or commutation relations. 

The problem of finding quantum conditions is not of such a general 
character as those we have been concerned with up to the present. It 
is instead a special problem which presents itself with each particular 
dynamical system one is called upon to study. There is, however, 
a fairly general method of obtaining quantum conditions, applicable 
to a very large class of dynamical systems. This is the method of 
classical analogy and will form the main theme of the present chapter. 
Those dynamical systems to which this method is not applicable
must be treated individually and special considerations used in each 
case. 

The value of classical analogy in the development of quantum
mechanics depends on the fact that classical mechanics provides a
valid description of dynamical systems under certain conditions,
when the particles and bodies composing the systems are sufficiently
massive for the disturbance accompanying an observation to be
negligible. Classical mechanics must therefore be a limiting case of
quantum mechanics. We should thus expect to find that important
concepts in classical mechanics correspond to important concepts in
quantum mechanics, and, from an understanding of the general
nature of the analogy between classical and quantum mechanics, we
may hope to get laws and theorems in quantum mechanics appearing
as simple generalizations of well-known results in classical mechanics;
in particular we may hope to get the quantum conditions appearing
as a simple generalization of the classical law that all dynamical
variables commute.

Let us take a dynamical system composed of a number of particles 
in interaction. As independent dynamical variables for dealing with 
the system we may use the Cartesian coordinates of all the particles 
and the corresponding Cartesian components of velocity of the par- 
ticles. It is, however, more convenient to work with the momentum 
components instead of the velocity components. Let us call the 
coordinates $q_r$, $r$ going from 1 to three times the number of particles,
and the corresponding momentum components $p_r$. The $q$'s and $p$'s
are called canonical coordinates and momenta.

The method of Lagrange's equations of motion involves introducing
coordinates $q_r$ and momenta $p_r$ in a more general way, applicable
also for a system not composed of particles (e.g. a system containing
rigid bodies). These more general $q$'s and $p$'s are also called canonical
coordinates and momenta. Any dynamical variable is expressible in
terms of a set of canonical coordinates and momenta.

An important concept in general dynamical theory is the Poisson
Bracket. Any two dynamical variables u and v have a P.B. (Poisson
Bracket) which we shall denote by $[u, v]$, defined by

$$[u, v] = \sum_r \left\{ \frac{\partial u}{\partial q_r}\frac{\partial v}{\partial p_r} - \frac{\partial u}{\partial p_r}\frac{\partial v}{\partial q_r} \right\}, \tag{1}$$

u and v being regarded as functions of a set of canonical coordinates
and momenta $q_r$ and $p_r$ for the purpose of the differentiations. The
right-hand side of (1) is independent of which set of canonical
coordinates and momenta are used, this being a consequence of the
general definition of canonical coordinates and momenta, so the
P.B. $[u, v]$ is well defined.

The main properties of P.B.'s, which follow at once from their
definition (1), are

$$[u, v] = -[v, u], \tag{2}$$

$$[u, c] = 0, \tag{3}$$

where c is a number (which may be considered as a special case of a
dynamical variable),

$$[u_1 + u_2, v] = [u_1, v] + [u_2, v], \qquad [u, v_1 + v_2] = [u, v_1] + [u, v_2], \tag{4}$$

$$[u_1 u_2, v] = [u_1, v]u_2 + u_1[u_2, v], \qquad [u, v_1 v_2] = [u, v_1]v_2 + v_1[u, v_2]. \tag{5}$$

Also the identity

$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0 \tag{6}$$

is easily verified. Equations (4) express that the P.B. $[u, v]$ involves
u and v linearly, while equations (5) correspond to the ordinary rules
for differentiating a product.
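Definition (1) and properties (2) and (5) can be checked numerically for a single degree of freedom. The sketch below is only an illustration under stated assumptions: the derivatives are approximated by central differences, and the function names and the oscillator-like `energy` are hypothetical choices made here.

```python
def poisson_bracket(u, v, q, p, h=1e-4):
    """[u, v] = du/dq dv/dp - du/dp dv/dq at the point (q, p), one degree of freedom."""
    du_dq = (u(q + h, p) - u(q - h, p)) / (2 * h)
    du_dp = (u(q, p + h) - u(q, p - h)) / (2 * h)
    dv_dq = (v(q + h, p) - v(q - h, p)) / (2 * h)
    dv_dp = (v(q, p + h) - v(q, p - h)) / (2 * h)
    return du_dq * dv_dp - du_dp * dv_dq

def coord(q, p):
    return q

def mom(q, p):
    return p

def energy(q, p):
    # hypothetical dynamical variable used only as a test case
    return 0.5 * p * p + 0.5 * q * q

def coord_times_energy(q, p):
    return coord(q, p) * energy(q, p)

q0, p0 = 0.7, -1.3
pb_qp = poisson_bracket(coord, mom, q0, p0)       # should be close to 1
pb_qH = poisson_bracket(coord, energy, q0, p0)    # should be close to p0
pb_Hq = poisson_bracket(energy, coord, q0, p0)    # antisymmetry (2)

# First of equations (5): [u1 u2, v] = [u1, v] u2 + u1 [u2, v]
lhs = poisson_bracket(coord_times_energy, mom, q0, p0)
rhs = (poisson_bracket(coord, mom, q0, p0) * energy(q0, p0)
       + coord(q0, p0) * poisson_bracket(energy, mom, q0, p0))
```

In the classical theory the order of the factors on the right of (5) is immaterial, which is why ordinary multiplication of numbers suffices here.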

Let us try to introduce a quantum P.B. which shall be the analogue
of the classical one. We assume the quantum P.B. to satisfy all the
conditions (2) to (6), it being now necessary that the order of the
factors $u_1$ and $u_2$ in the first of equations (5) should be preserved
throughout the equation, as in the way we have here written it, and
similarly for the $v_1$ and $v_2$ in the second of equations (5). These
conditions are already sufficient to determine the form of the quantum
P.B. uniquely, as may be seen from the following argument. We can
evaluate the P.B. $[u_1 u_2, v_1 v_2]$ in two different ways, since we can use
either of the two formulas (5) first, thus,


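$$[u_1 u_2, v_1 v_2] = [u_1, v_1 v_2]u_2 + u_1[u_2, v_1 v_2]$$
$$= \{[u_1, v_1]v_2 + v_1[u_1, v_2]\}u_2 + u_1\{[u_2, v_1]v_2 + v_1[u_2, v_2]\}$$
$$= [u_1, v_1]v_2 u_2 + v_1[u_1, v_2]u_2 + u_1[u_2, v_1]v_2 + u_1 v_1[u_2, v_2]$$

and

$$[u_1 u_2, v_1 v_2] = [u_1 u_2, v_1]v_2 + v_1[u_1 u_2, v_2]$$
$$= [u_1, v_1]u_2 v_2 + u_1[u_2, v_1]v_2 + v_1[u_1, v_2]u_2 + v_1 u_1[u_2, v_2].$$

Equating these two results, we obtain

$$[u_1, v_1](u_2 v_2 - v_2 u_2) = (u_1 v_1 - v_1 u_1)[u_2, v_2].$$

Since this condition holds with $u_1$ and $v_1$ quite independent of $u_2$ and
$v_2$, we must have

$$u_1 v_1 - v_1 u_1 = i\hbar[u_1, v_1],$$
$$u_2 v_2 - v_2 u_2 = i\hbar[u_2, v_2],$$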


where $\hbar$ must not depend on $u_1$ and $v_1$, nor on $u_2$ and $v_2$, and also
must commute with $(u_1 v_1 - v_1 u_1)$. It follows that $\hbar$ must be simply
a number. We want the P.B. of two real variables to be real, as in
the classical theory, which requires, from the work at the top of p. 28,
that $\hbar$ shall be a real number when introduced, as here, with the
coefficient $i$. We are thus led to the following definition for the
quantum P.B. $[u, v]$ of any two variables u and v,

$$uv - vu = i\hbar[u, v], \tag{7}$$

in which $\hbar$ is a new universal constant. It has the dimensions of
action. In order that the theory may agree with experiment, we
must take $\hbar$ equal to $h/2\pi$, where $h$ is the universal constant that
was introduced by Planck, known as Planck's constant. It is easily
verified that the quantum P.B. satisfies all the conditions (2), (3), (4),
(5), and (6).
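The verification mentioned for the quantum P.B. can be illustrated with small matrices standing in for non-commuting dynamical variables. This is only a sketch under stated assumptions: the 2 x 2 integer matrices are arbitrary test data, units with $\hbar = 1$ are assumed, and the helper names are hypothetical.

```python
HBAR = 1.0  # assumed units in which hbar = 1

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, sign=1):
    """Entry-wise a + sign * b."""
    return [[x + sign * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def quantum_pb(u, v):
    """Quantum P.B. [u, v] obtained from uv - vu = i hbar [u, v], equation (7)."""
    commutator = combine(matmul(u, v), matmul(v, u), sign=-1)
    return [[-1j * x / HBAR for x in row] for row in commutator]

def largest_entry(m):
    return max(abs(x) for row in m for x in row)

u1 = [[1, 2], [3, 4]]     # arbitrary non-commuting test matrices
u2 = [[0, 1], [-1, 2]]
v = [[2, 0], [1, -1]]

# First of equations (5), with the order of u1 and u2 preserved.
lhs = quantum_pb(matmul(u1, u2), v)
rhs = combine(matmul(quantum_pb(u1, v), u2), matmul(u1, quantum_pb(u2, v)))
product_rule_error = largest_entry(combine(lhs, rhs, sign=-1))

# Identity (6).
jacobi = combine(
    combine(quantum_pb(u1, quantum_pb(u2, v)), quantum_pb(u2, quantum_pb(v, u1))),
    quantum_pb(v, quantum_pb(u1, u2)),
)
jacobi_error = largest_entry(jacobi)

# Writing u2 [u1, v] instead of [u1, v] u2 spoils the rule for these matrices.
wrong = combine(matmul(u2, quantum_pb(u1, v)), matmul(u1, quantum_pb(u2, v)))
wrong_order_error = largest_entry(combine(lhs, wrong, sign=-1))
```

The last check shows why the order of the factors in (5) has to be preserved once the variables no longer commute.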

The problem of finding quantum conditions now reduces to the
problem of determining P.B.'s in quantum mechanics. The strong
analogy between the quantum P.B. defined by (7) and the classical
P.B. defined by (1) leads us to make the assumption that the quantum
P.B.'s, or at any rate the simpler ones of them, have the same values
as the corresponding classical P.B.'s. The simplest P.B.'s are those
involving the canonical coordinates and momenta themselves and
have the following values in the classical theory:

$$[q_r, q_s] = 0, \qquad [p_r, p_s] = 0, \qquad [q_r, p_s] = \delta_{rs}. \tag{8}$$

We therefore assume that the corresponding quantum P.B.'s also
have the values given by (8). By eliminating the quantum P.B.'s
with the help of (7), we obtain the equations

$$q_r q_s - q_s q_r = 0, \qquad p_r p_s - p_s p_r = 0, \qquad q_r p_s - p_s q_r = i\hbar\,\delta_{rs}, \tag{9}$$
which are the fundamental quantum conditions. They show us where
the lack of commutability among the canonical coordinates and
momenta lies. They also provide us with a basis for calculating
commutation relations between other dynamical variables. For instance,
if $\xi$ and $\eta$ are any two functions of the $q$'s and $p$'s expressible as
power series, we may express $\xi\eta - \eta\xi$, or $[\xi, \eta]$, by repeated
applications of the laws (2), (3), (4), and (5), in terms of the elementary
P.B.'s given in (8) and so evaluate it. The result is often, in simple
cases, the same as the classical result, or departs from the classical
result only through requiring a special order for factors in a product,
this order being, of course, unimportant in the classical theory. Even
when $\xi$ and $\eta$ are more general functions of the $q$'s and $p$'s not
expressible as power series, equations (9) are still sufficient to fix the
value of $\xi\eta - \eta\xi$, as will become clear from the following work.
Equations (9) thus give the solution of the problem of finding the
quantum conditions, for all those dynamical systems which have a
classical analogue and which are describable in terms of canonical
coordinates and momenta. This does not include all possible systems
in quantum mechanics.

Equations (7) and (9) provide the foundation for the analogy
between quantum mechanics and classical mechanics. They show
that classical mechanics may be regarded as the limiting case of quantum
mechanics when $\hbar$ tends to zero. A P.B. in quantum mechanics is a
purely algebraic notion and is thus a rather more fundamental concept
than a classical P.B., which can be defined only with reference to
a set of canonical coordinates and momenta. For this reason canonical
coordinates and momenta are of less importance in quantum mechanics
than in classical mechanics; in fact, we may have a system in quantum
mechanics for which canonical coordinates and momenta do
not exist and we can still give a meaning to P.B.'s. Such a system
would be one without a classical analogue and we should not be able
to obtain its quantum conditions by the method here described.

From equations (9) we see that two variables with different suffixes
r and s always commute. It follows that any function of $q_r$ and $p_r$
will commute with any function of $q_s$ and $p_s$ when s differs from r.
Different values of r correspond to different degrees of freedom of the
dynamical system, so we get the result that dynamical variables
referring to different degrees of freedom commute. This law, as we have
derived it from (9), is proved only for dynamical systems with
classical analogues, but we assume it to hold generally. In this way
we can make a start on the problem of finding quantum conditions
for dynamical systems for which canonical coordinates and momenta
do not exist, provided we can give a meaning to different degrees of
freedom, as we may be able to do with the help of physical insight.

We can now see the physical meaning of the division, which was
discussed in the preceding section, of the dynamical variables into
sets, any member of one set commuting with any member of another.
Each set corresponds to certain degrees of freedom, or possibly just
one degree of freedom. The division may correspond to the physical
process of resolving the dynamical system into its constituent parts,
each constituent being capable of existing by itself as a physical
system, and the various constituents having to be brought into
interaction with one another to produce the original system.
Alternatively the division may be merely a mathematical procedure of
resolving the dynamical system into degrees of freedom which cannot
be separated physically, e.g. the system consisting of a particle with
internal structure may be divided into the degrees of freedom
describing the motion of the centre of the particle and those describing the
internal structure.


22. Schrödinger's representation

Let us consider a dynamical system with n degrees of freedom
having a classical analogue, and thus describable in terms of canonical
coordinates and momenta $q_r$, $p_r$ ($r = 1, 2, \ldots, n$). We assume that the
coordinates $q_r$ are all observables and have continuous ranges of
eigenvalues, these assumptions being reasonable from the physical
significance of the $q$'s. Let us set up a representation with the $q$'s diagonal.
The question arises whether the $q$'s form a complete commuting set
for this dynamical system. It seems pretty obvious from inspection
that they do. We shall here assume that they do, and the assumption
will be justified later (see top of p. 92). With the $q$'s forming a
complete commuting set, the representation is fixed except for the
arbitrary phase factors in it.

Let us consider first the case of n = 1, so that there is only one q
and p, satisfying

$$qp - pq = i\hbar. \tag{10}$$

Any ket may be written in the standard ket notation $\psi(q)\rangle$. From it
we can form another ket $d\psi/dq\rangle$, whose representative is the derivative
of the original one. This new ket is a linear function of the
original one and is thus the result of some linear operator applied to
the original one. Calling this linear operator $d/dq$, we have

$$\frac{d}{dq}\psi\rangle = \frac{d\psi}{dq}\rangle. \tag{11}$$

Equation (11) holding for all functions $\psi$ defines the linear operator
$d/dq$. We have

$$\frac{d}{dq}\rangle = 0. \tag{12}$$


Let us treat the linear operator $d/dq$ according to the general theory
of linear operators of § 7. We should then be able to apply it to a bra
$\langle\phi(q)$, the product $\langle\phi\, d/dq$ being defined, according to (3) of § 7, by

$$\left\{\langle\phi\frac{d}{dq}\right\}\psi\rangle = \langle\phi\left\{\frac{d}{dq}\psi\rangle\right\} \tag{13}$$

for all functions $\psi(q)$. Taking representatives, we get

$$\int \langle\phi\frac{d}{dq}|q'\rangle\, dq'\, \psi(q') = \int \phi(q')\, dq'\, \frac{d\psi(q')}{dq'}. \tag{14}$$

We can transform the right-hand side by partial integration and get

$$\int \langle\phi\frac{d}{dq}|q'\rangle\, dq'\, \psi(q') = -\int \frac{d\phi(q')}{dq'}\, dq'\, \psi(q'), \tag{15}$$

provided the contributions from the limits of integration vanish.
This gives

$$\langle\phi\frac{d}{dq}|q'\rangle = -\frac{d\phi(q')}{dq'},$$

showing that

$$\langle\phi\frac{d}{dq} = -\langle\frac{d\phi}{dq}. \tag{16}$$

Thus $d/dq$ operating to the left on the conjugate complex of a wave
function has the meaning of minus differentiation with respect to q.

The validity of this result depends on our being able to make the
passage from (14) to (16), which requires that we must restrict
ourselves to bras and kets corresponding to wave functions that satisfy
suitable boundary conditions. The conditions usually holding in
practice are that they vanish at the boundaries. (Somewhat more
general conditions will be given in the next section.) These conditions
do not limit the physical applicability of the theory, but, on the
contrary, are usually required also on physical grounds. For example,
if q is a Cartesian coordinate of a particle, its eigenvalues run from
$-\infty$ to $\infty$, and the physical requirement that the particle has zero
probability of being at infinity leads to the condition that the wave
function vanishes for $q = \pm\infty$.

The conjugate complex of the linear operator $d/dq$ can be evaluated
by noting that the conjugate imaginary of $d/dq\,\psi\rangle$ or $d\psi/dq\rangle$ is
$\langle d\bar\psi/dq$, or $-\langle\bar\psi\, d/dq$ from (16). Thus the conjugate complex of $d/dq$
is $-d/dq$, so $d/dq$ is a pure imaginary linear operator.


To get the representative of $d/dq$ we note that, from an application
of formula (63) of § 20,

$$|q''\rangle = \delta(q - q'')\rangle, \tag{17}$$

so that

$$\frac{d}{dq}|q''\rangle = \frac{d}{dq}\delta(q - q'')\rangle, \tag{18}$$

and hence

$$\langle q'|\frac{d}{dq}|q''\rangle = \frac{d}{dq'}\delta(q' - q''). \tag{19}$$

The representative of $d/dq$ involves the derivative of the $\delta$ function.

Let us work out the commutation relation connecting $d/dq$ with q.
We have

$$\frac{d}{dq}q\psi\rangle = \frac{d(q\psi)}{dq}\rangle = q\frac{d}{dq}\psi\rangle + \psi\rangle. \tag{20}$$

Since this holds for any ket $\psi\rangle$, we have

$$\frac{d}{dq}q - q\frac{d}{dq} = 1. \tag{21}$$

Comparing this result with (10), we see that $-i\hbar\, d/dq$ satisfies the
same commutation relation with q that p does.
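Relations (12) and (21) are purely algebraic and can be tried out on polynomials in q. The sketch below is an illustration only: a polynomial is stored as a list of coefficients, the helper names are hypothetical, and a polynomial is of course not a wave function satisfying the boundary conditions discussed above; it serves merely to exercise the rule (20).

```python
def d_dq(psi):
    """Derivative of the polynomial c0 + c1 q + c2 q^2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(psi)][1:] or [0]

def q_times(psi):
    """Multiplication by q, which raises every power by one."""
    return [0] + list(psi)

def subtract(a, b):
    """Coefficient-wise a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

psi = [5, -2, 0, 7]   # hypothetical psi(q) = 5 - 2q + 7q^3

# (d/dq q - q d/dq) applied to psi should give psi back, as in (21).
commutator_on_psi = subtract(d_dq(q_times(psi)), q_times(d_dq(psi)))

# The function 1 plays the part of the standard ket here, as in (12).
derivative_of_one = d_dq([1])
```

Multiplying the result by $-i\hbar$ reproduces the right-hand side of (10) with the sign convention $qp - pq = i\hbar$.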

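To extend the foregoing work to the case of arbitrary n, we write
the general ket as $\psi(q_1 \ldots q_n)\rangle = \psi\rangle$ and introduce the n linear
operators $\partial/\partial q_r$ ($r = 1, \ldots, n$), which can operate on it in accordance with
the formula

$$\frac{\partial}{\partial q_r}\psi\rangle = \frac{\partial\psi}{\partial q_r}\rangle, \tag{22}$$

corresponding to (11). We have

$$\frac{\partial}{\partial q_r}\rangle = 0, \tag{23}$$

corresponding to (12). Provided we restrict ourselves to bras and
kets corresponding to wave functions satisfying suitable boundary
conditions, these linear operators can operate also on bras, in
accordance with the formula

$$\langle\phi\frac{\partial}{\partial q_r} = -\langle\frac{\partial\phi}{\partial q_r}, \tag{24}$$

corresponding to (16). Thus $\partial/\partial q_r$ can operate to the left on the
conjugate complex of a wave function, when it has the meaning of
minus partial differentiation with respect to $q_r$. We find as before
that each $\partial/\partial q_r$ is a pure imaginary linear operator. Corresponding
to (21) we have the commutation relations

$$\frac{\partial}{\partial q_r}q_s - q_s\frac{\partial}{\partial q_r} = \delta_{rs}. \tag{25}$$

We have further

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s}\psi\rangle = \frac{\partial^2\psi}{\partial q_r\,\partial q_s}\rangle = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}\psi\rangle, \tag{26}$$

showing that

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s} = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}. \tag{27}$$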


Comparing (25) and (27) with (9), we see that the linear operators
$-i\hbar\,\partial/\partial q_r$ satisfy the same commutation relations with the $q$'s and with
each other that the $p$'s do.

It would be possible to take

$$p_r = -i\hbar\,\partial/\partial q_r \tag{28}$$

without getting any inconsistency. This possibility enables us to see
that the $q$'s must form a complete commuting set of observables,
since it means that any function of the $q$'s and $p$'s could be taken
to be a function of the $q$'s and $-i\hbar\,\partial/\partial q$'s and then could not commute
with all the $q$'s unless it is a function of the $q$'s only.

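The equations (28) do not necessarily hold. But in any case the
quantities $p_r + i\hbar\,\partial/\partial q_r$ each commute with all the $q$'s, so each of them
is a function of the $q$'s, from Theorem 2 of § 19. Thus

$$p_r = -i\hbar\,\partial/\partial q_r + f_r(q). \tag{29}$$

Since $p_r$ and $-i\hbar\,\partial/\partial q_r$ are both real, $f_r(q)$ must be real. For any
function f of the $q$'s we have

$$\frac{\partial}{\partial q_r}f\psi\rangle = f\frac{\partial}{\partial q_r}\psi\rangle + \frac{\partial f}{\partial q_r}\psi\rangle,$$

showing that

$$\frac{\partial}{\partial q_r}f - f\frac{\partial}{\partial q_r} = \frac{\partial f}{\partial q_r}. \tag{30}$$

With the help of (29) we can now deduce the general formula

$$p_r f - f p_r = -i\hbar\,\partial f/\partial q_r. \tag{31}$$

This formula may be written in P.B. notation

$$[f, p_r] = \partial f/\partial q_r, \tag{32}$$

when it is the same as in the classical theory, as follows from (1).
Multiplying (27) by $(-i\hbar)^2$ and substituting for $-i\hbar\,\partial/\partial q_r$ and $-i\hbar\,\partial/\partial q_s$
their values given by (29), we get

$$(p_r - f_r)(p_s - f_s) = (p_s - f_s)(p_r - f_r),$$

which reduces, with the help of the quantum condition $p_r p_s = p_s p_r$, to

$$p_r f_s + f_r p_s = p_s f_r + f_s p_r.$$

This reduces further, with the help of (31), to

$$\partial f_s/\partial q_r = \partial f_r/\partial q_s, \tag{33}$$

showing that the functions $f_r$ are all of the form

$$f_r = \partial F/\partial q_r \tag{34}$$

with F independent of r. Equation (29) now becomes

$$p_r = -i\hbar\,\partial/\partial q_r + \partial F/\partial q_r. \tag{35}$$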

We have been working with a representation which is fixed to the
extent that the $q$'s must be diagonal in it, but which contains arbitrary
phase factors. If the phase factors are changed, the operators $\partial/\partial q_r$
get changed. It will now be shown that, by a suitable change in the
phase factors, the function F in (35) can be made to vanish, so that
equations (28) are made to hold.

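Using stars to distinguish quantities referring to the new
representation with the new phase factors, we shall have the new basic
bras connected with the previous ones by

$$\langle q'_1 \ldots q'_n{}^*| = e^{i\gamma'}\langle q'_1 \ldots q'_n|, \tag{36}$$

where $\gamma' = \gamma(q')$ is a real function of the $q'$'s. The new representative
of a ket is $e^{i\gamma'}$ times the old one, showing that $e^{i\gamma}\psi\rangle^* = \psi\rangle$, so

$$\rangle^* = e^{-i\gamma}\rangle \tag{37}$$

as the connexion between the new standard ket and the original one.
The new linear operator $(\partial/\partial q_r)^*$ satisfies, corresponding to (22),

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = \frac{\partial\psi}{\partial q_r}\rangle^* = e^{-i\gamma}\frac{\partial\psi}{\partial q_r}\rangle$$

with the help of (37). Using (22), this gives

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = e^{-i\gamma}\frac{\partial}{\partial q_r}\psi\rangle = e^{-i\gamma}\frac{\partial}{\partial q_r}e^{i\gamma}\psi\rangle^*,$$

showing that

$$\left(\frac{\partial}{\partial q_r}\right)^* = e^{-i\gamma}\frac{\partial}{\partial q_r}e^{i\gamma}, \tag{38}$$

or, with the help of (30),

$$\left(\frac{\partial}{\partial q_r}\right)^* = \frac{\partial}{\partial q_r} + i\frac{\partial\gamma}{\partial q_r}. \tag{39}$$

By choosing $\gamma$ so that

$$F = \hbar\gamma + \text{a constant}, \tag{40}$$

(35) becomes

$$p_r = -i\hbar\,(\partial/\partial q_r)^*. \tag{41}$$

Equation (40) fixes $\gamma$ except for an arbitrary constant, so the
representation is fixed except for an arbitrary constant phase factor.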
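The two steps that are only indicated in the text, the passage from (38) to (39) by means of (30) with $f = e^{i\gamma}$ and the passage from (35) to (41) by means of (40), may be written out as follows. This is a worked restatement of the argument above, not an additional result.

```latex
\begin{align*}
\left(\frac{\partial}{\partial q_r}\right)^{*}
  &= e^{-i\gamma}\,\frac{\partial}{\partial q_r}\,e^{i\gamma}
   = e^{-i\gamma}\left\{ e^{i\gamma}\,\frac{\partial}{\partial q_r}
      + \frac{\partial e^{i\gamma}}{\partial q_r} \right\}
   = \frac{\partial}{\partial q_r} + i\,\frac{\partial\gamma}{\partial q_r}, \\
-i\hbar\left(\frac{\partial}{\partial q_r}\right)^{*}
  &= -i\hbar\,\frac{\partial}{\partial q_r} + \hbar\,\frac{\partial\gamma}{\partial q_r}
   = -i\hbar\,\frac{\partial}{\partial q_r} + \frac{\partial F}{\partial q_r}
   = p_r .
\end{align*}
```

The constant in (40) drops out on differentiation, which is why $\gamma$ is fixed only up to an arbitrary constant.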

In this way we see that a representation can be set up in which
the $q$'s are diagonal and equations (28) hold. This representation is
a very useful one for many problems. It will be called Schrödinger's
representation, as it was the representation in terms of which
Schrödinger gave his original formulation of quantum mechanics in 1926.
Schrödinger's representation exists whenever one has canonical $q$'s
and $p$'s, and is completely determined by these $q$'s and $p$'s except for
an arbitrary constant phase factor. It owes its great convenience to
its allowing one to express immediately any algebraic function of the
$q$'s and $p$'s of the form of a power series in the $p$'s as an operator of
differentiation, e.g. if $f(q_1, \ldots, q_n, p_1, \ldots, p_n)$ is such a function, we have

$$f(q_1, \ldots, q_n, p_1, \ldots, p_n) = f(q_1, \ldots, q_n, -i\hbar\,\partial/\partial q_1, \ldots, -i\hbar\,\partial/\partial q_n), \tag{42}$$

provided we preserve the order of the factors in a product on
substituting the $-i\hbar\,\partial/\partial q$'s for the $p$'s.

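From (23) and (28), we have

$$p_r\rangle = 0. \tag{43}$$

Thus the standard ket in Schrödinger's representation is characterized
by the condition that it is a simultaneous eigenket of all the momenta
belonging to the eigenvalues zero. Some properties of the basic
vectors of Schrödinger's representation may also be noted. Equation
(22) gives

$$\langle q'_1 \ldots q'_n|\frac{\partial}{\partial q_r}\psi\rangle = \langle q'_1 \ldots q'_n|\frac{\partial\psi}{\partial q_r}\rangle = \frac{\partial\psi(q'_1 \ldots q'_n)}{\partial q'_r} = \frac{\partial}{\partial q'_r}\langle q'_1 \ldots q'_n|\psi\rangle.$$

Hence

$$\langle q'_1 \ldots q'_n|\frac{\partial}{\partial q_r} = \frac{\partial}{\partial q'_r}\langle q'_1 \ldots q'_n|, \tag{44}$$

so that

$$\langle q'_1 \ldots q'_n|\,p_r = -i\hbar\frac{\partial}{\partial q'_r}\langle q'_1 \ldots q'_n|. \tag{45}$$

Similarly, equation (24) leads to

$$p_r|q'_1 \ldots q'_n\rangle = i\hbar\frac{\partial}{\partial q'_r}|q'_1 \ldots q'_n\rangle. \tag{46}$$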


23. The momentum representation

Let us take a system with one degree of freedom, describable in
terms of a q and p with the eigenvalues of q running from $-\infty$ to $\infty$,
and let us take an eigenket $|p'\rangle$ of p. Its representative in the
Schrödinger representation, $\langle q'|p'\rangle$, satisfies

$$p'\langle q'|p'\rangle = \langle q'|p|p'\rangle = -i\hbar\frac{d}{dq'}\langle q'|p'\rangle,$$

with the help of (45) applied to the case of one degree of freedom.
The solution of this differential equation for $\langle q'|p'\rangle$ is

$$\langle q'|p'\rangle = c'e^{ip'q'/\hbar}, \tag{47}$$

where $c' = c'(p')$ is independent of $q'$, but may involve $p'$.
The representative $\langle q'|p'\rangle$ does not satisfy the boundary conditions
of vanishing at $q' = \pm\infty$. This gives rise to some difficulty, which
shows itself up most directly in the failure of the orthogonality
theorem. If we take a second eigenket $|p''\rangle$ of p with representative

$$\langle q'|p''\rangle = c''e^{ip''q'/\hbar},$$

belonging to a different eigenvalue $p''$, we shall have

$$\langle p'|p''\rangle = \int_{-\infty}^{\infty}\langle p'|q'\rangle\, dq'\, \langle q'|p''\rangle = \bar{c}'c''\int_{-\infty}^{\infty} e^{-i(p'-p'')q'/\hbar}\, dq'. \tag{48}$$

This integral does not converge according to the usual definition of
convergence. To bring the theory into order, we adopt a new definition
of convergence of an integral whose domain extends to infinity,
analogous to the Cesàro definition of the sum of an infinite series.
With this new definition, an integral whose value to the upper limit
$q'$ is of the form $\cos aq'$ or $\sin aq'$, with a a real number not zero, is
counted as zero when $q'$ tends to infinity, i.e. we take the mean value
of the oscillations, and similarly for the lower limit of $q'$ tending to
minus infinity. This makes the right-hand side of (48) vanish for
$p'' \neq p'$, so that the orthogonality theorem is restored. Also it makes
the right-hand sides of (13) and (14) equal when $\langle\phi$ and $\psi\rangle$ are
eigenvectors of p, so that eigenvectors of p become permissible vectors to
use with the operator $d/dq$. Thus the boundary conditions that the
representative of a permissible bra or ket has to satisfy become
extended to allow the representative to oscillate like $\cos aq'$ or $\sin aq'$
as $q'$ goes to infinity or minus infinity.

For $p''$ very close to $p'$, the right-hand side of (48) involves a $\delta$
function. To evaluate it, we need the formula

$$\int_{-\infty}^{\infty} e^{iax}\, dx = 2\pi\delta(a) \tag{49}$$

for real a, which may be proved as follows. The formula evidently
holds for a different from zero, as both sides are then zero. Further
we have, for any continuous function $f(a)$,

$$\int_{-\infty}^{\infty} f(a)\, da \int_{-g}^{g} e^{iax}\, dx = \int_{-\infty}^{\infty} f(a)\, da\; 2a^{-1}\sin ag = 2\pi f(0)$$

in the limit when g tends to infinity. A more complicated argument
shows that we get the same result if instead of the limits g and $-g$
we put $g_1$ and $-g_2$, and then let $g_1$ and $g_2$ tend to infinity in different
ways (not too widely different). This shows the equivalence of both
sides of (49) as factors in an integrand, which proves the formula.

With the help of (49), (48) becomes

$$\langle p'|p''\rangle = \bar{c}'c''\,2\pi\,\delta[(p'-p'')/\hbar] = \bar{c}'c''\,h\,\delta(p'-p'') = |c'|^2\,h\,\delta(p'-p''). \tag{50}$$

We have obtained an eigenket of p belonging to any real eigenvalue
$p'$, its representative being given by (47). Any ket $|X\rangle$ can be
expanded in terms of these eigenkets of p, since its representative
$\langle q'|X\rangle$ can be expanded in terms of the representatives (47) by
Fourier analysis. It follows that the momentum p is an observable,
in agreement with the experimental result that momenta can be
observed.
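The limiting statement used in the proof of (49), that $\int f(a)\,2a^{-1}\sin ag\,da$ tends to $2\pi f(0)$ as g tends to infinity, can be watched numerically. The sketch below is only an illustration under stated assumptions: the test function is a Gaussian chosen here, the integral is cut off at a finite `a_max`, and a simple trapezoidal rule is used; none of these choices comes from the text.

```python
import math

def smeared_kernel(f, g, a_max=8.0, step=0.001):
    """Trapezoidal estimate of the integral of f(a) * 2 sin(a g) / a over [-a_max, a_max]."""
    n = int(round(2 * a_max / step))
    total = 0.0
    for k in range(n + 1):
        a = -a_max + k * step
        # At a = 0 the kernel 2 sin(a g) / a takes its limiting value 2 g.
        kernel = 2.0 * g if abs(a) < 1e-12 else 2.0 * math.sin(a * g) / a
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f(a) * kernel
    return total * step

def gaussian(a):
    # hypothetical smooth test function with f(0) = 1
    return math.exp(-a * a)

# The estimates approach 2 * pi * f(0) as g grows.
values = [smeared_kernel(gaussian, g) for g in (1.0, 5.0, 20.0)]
```

For this particular test function the integral can also be done in closed form, giving $2\pi\,\mathrm{erf}(g/2)$, which makes the approach to $2\pi$ explicit.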

A symmetry now appears between q and p. Each of them is an
observable with eigenvalues extending from $-\infty$ to $\infty$, and the
commutation relation connecting q and p, equation (10), remains
invariant if we interchange q and p and write $-i$ for $i$. We have set
up a representation in which q is diagonal and $p = -i\hbar\, d/dq$. It
follows from the symmetry that we can also set up a representation
in which p is diagonal and

$$q = i\hbar\, d/dp, \tag{51}$$

the operator $d/dp$ being defined by a procedure similar to that used
for $d/dq$. This representation will be called the momentum
representation. It is less useful than the previous Schrödinger representation
because, while the Schrödinger representation enables one to express
as an operator of differentiation any function of q and p that is a
power series in p, the momentum representation enables one so to
express any function of q and p that is a power series in q, and the
important quantities in dynamics are almost always power series in
p but are often not power series in q. All the same the momentum
representation is of value for certain problems (see § 50).

Let us calculate the transformation function $\langle q'|p'\rangle$ connecting the
two representations. The basic kets $|p'\rangle$ of the momentum
representation are eigenkets of p and their Schrödinger representatives $\langle q'|p'\rangle$
are given by (47) with the coefficients $c'$ suitably chosen. The phase
factors of these basic kets must be chosen so as to make (51) hold.
The easiest way to bring in this condition is to use the symmetry
between q and p referred to above, according to which $\langle q'|p'\rangle$ must
go over into $\langle p'|q'\rangle$ if we interchange $q'$ and $p'$ and write $-i$ for $i$.
Now $\langle q'|p'\rangle$ is equal to the right-hand side of (47) and $\langle p'|q'\rangle$ to the
conjugate complex expression, and hence $c'$ must be independent of
$p'$. Thus $c'$ is just a number c. Further, we must have

$$\langle p'|p''\rangle = \delta(p'-p''),$$

which shows, on comparison with (50), that $|c| = h^{-1/2}$. We can choose
the arbitrary constant phase factor in either representation so as to
make $c = h^{-1/2}$, and we then get

$$\langle q'|p'\rangle = h^{-1/2}\,e^{ip'q'/\hbar} \tag{52}$$

for the transformation function.

The foregoing work may easily be generalized to a system with
n degrees of freedom, describable in terms of n q's and p's, with the
eigenvalues of each q running from $-\infty$ to $\infty$. Each p will then be
an observable with eigenvalues running from $-\infty$ to $\infty$, and there
will be symmetry between the set of q's and the set of p's, the
commutation relations remaining invariant if we interchange each $q_r$
with the corresponding $p_r$ and write $-i$ for $i$. A momentum
representation can be set up in which the p's are diagonal and each

$$q_r = i\hbar\,\partial/\partial p_r. \tag{53}$$

The transformation function connecting it with the Schrödinger
representation will be given by the product of the transformation
functions for each degree of freedom separately, as is shown by
formula (67) of § 20, and will thus be

$$\langle q'_1 q'_2 \ldots q'_n|p'_1 p'_2 \ldots p'_n\rangle = \langle q'_1|p'_1\rangle\langle q'_2|p'_2\rangle\ldots\langle q'_n|p'_n\rangle = h^{-n/2}\,e^{i(p'_1q'_1+p'_2q'_2+\ldots+p'_nq'_n)/\hbar}. \tag{54}$$


24. Heisenberg's principle of uncertainty
For a system with one degree of freedom, the Schrödinger and the
momentum representatives of a ket $|X\rangle$ are connected by

$$\left.\begin{aligned}
\langle p'|X\rangle &= h^{-1/2}\int_{-\infty}^{\infty} e^{-iq'p'/\hbar}\,dq'\,\langle q'|X\rangle, \\
\langle q'|X\rangle &= h^{-1/2}\int_{-\infty}^{\infty} e^{iq'p'/\hbar}\,dp'\,\langle p'|X\rangle.
\end{aligned}\right\} \tag{55}$$

These formulas have an elementary significance. They show that
either of the representatives is given, apart from numerical coefficients,
by the amplitudes of the Fourier components of the other.
It is interesting to apply (55) to a ket whose Schrödinger
representative consists of what is called a wave packet. This is a function
whose value is very small everywhere outside a certain domain, of
width $\Delta q'$ say, and inside this domain is approximately periodic with
a definite frequency.† If a Fourier analysis is made of such a wave
packet, the amplitude of all the Fourier components will be small,
except those in the neighbourhood of the definite frequency. The
components whose amplitudes are not small will fill up a frequency
band whose width is of the order $1/\Delta q'$, since two components whose
frequencies differ by this amount, if in phase in the middle of the
domain $\Delta q'$, will be just out of phase and interfering at the ends of
this domain. Now in the first of equations (55) the variable
$(2\pi)^{-1}p'/\hbar = p'/h$ plays the part of frequency. Thus with $\langle q'|X\rangle$ of the
form of a wave packet, the function $\langle p'|X\rangle$, being composed of the
amplitudes of the Fourier components of the wave packet, will be
small everywhere in the p'-space outside a certain domain of width
$\Delta p' = h/\Delta q'$.

Let us now apply the physical interpretation of the square of the
modulus of the representative of a ket as a probability. We find that
our wave packet represents a state for which a measurement of q is
almost certain to lead to a result lying in a domain of width $\Delta q'$ and
a measurement of p is almost certain to lead to a result lying in a
domain of width $\Delta p'$. We may say that for this state q has a definite
value with an error of order $\Delta q'$ and p has a definite value with an
error of order $\Delta p'$. The product of these two errors is

$$\Delta q'\,\Delta p' = h. \tag{56}$$

Thus the more accurately one of the variables q, p has a definite
value, the less accurately the other has a definite value. For a system
with several degrees of freedom, equation (56) applies to each degree
of freedom separately.
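The reciprocal relation between the two widths can be seen in a small numerical experiment. The sketch below is an illustration under assumptions made here, not a statement of the text: units with ħ = 1, a Gaussian packet, the root-mean-square deviation taken as the measure of width, and grids suited to `sigma` near 0.5. With this definition of width the product comes out as ħ/2 = h/4π, which agrees with (56) in order of magnitude, the numerical factor depending on how the widths are defined.

```python
import cmath
import math

HBAR = 1.0  # assumed units


def midpoints(lo, hi, n):
    """Return n midpoint sample points on [lo, hi] and the spacing."""
    step = (hi - lo) / n
    return [lo + (k + 0.5) * step for k in range(n)], step


def rms_width(points, density):
    """Root-mean-square deviation of an (unnormalized) sampled density."""
    norm = sum(density)
    mean = sum(x * w for x, w in zip(points, density)) / norm
    var = sum((x - mean) ** 2 * w for x, w in zip(points, density)) / norm
    return math.sqrt(var)


def uncertainty_product(sigma=0.5):
    """Width in q times width in p for a Gaussian packet of parameter sigma.

    The grids below are chosen for sigma near 0.5 and are an assumption of
    this sketch; other values would need wider or finer grids.
    """
    qs, dq = midpoints(-5.0, 5.0, 400)
    psi = [math.exp(-q * q / (2.0 * sigma ** 2)) for q in qs]
    ps, _ = midpoints(-10.0, 10.0, 200)
    # Momentum representative by direct Fourier integration, as in (55);
    # constant factors are dropped since they do not affect the widths.
    phi = [sum(cmath.exp(-1j * q * p / HBAR) * a for q, a in zip(qs, psi)) * dq
           for p in ps]
    width_q = rms_width(qs, [a * a for a in psi])
    width_p = rms_width(ps, [abs(b) ** 2 for b in phi])
    return width_q * width_p


if __name__ == "__main__":
    print(uncertainty_product(0.5), HBAR / 2.0)
```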

Equation (56) is known as Heisenberg's Principle of Uncertainty.
It shows clearly the limitations in the possibility of simultaneously
assigning numerical values, for any particular state, to two
non-commuting observables, when those observables are a canonical
coordinate and momentum, and provides a plain illustration of how
observations in quantum mechanics may be incompatible. It also
shows how classical mechanics, which assumes that numerical values
can be assigned simultaneously to all observables, may be a valid
approximation when h can be considered as small enough to be
negligible. Equation (56) holds only in the most favourable case,
which occurs when the representative of the state is of the form of a
wave packet. Other forms of representative would lead to a $\Delta q'$ and
$\Delta p'$ whose product is larger than h.

† Frequency here means reciprocal of wave-length.

Heisenberg's principle of uncertainty shows that, in the limit when
either q or p is completely determined, the other is completely
undetermined. This result can also be obtained directly from the
transformation function $\langle q'|p'\rangle$. According to the end of § 18,
$|\langle q'|p'\rangle|^2\,dq'$ is proportional to the probability of q having a value in
the small range from $q'$ to $q'+dq'$ for the state for which p certainly
has the value $p'$, and from (52) this probability is independent of $q'$
for a given $dq'$. Thus if p certainly has a definite value $p'$, all values
of q are equally probable. Similarly, if q certainly has a definite value
$q'$, all values of p are equally probable.

It is evident physically that a state for which all values of q are 
equally probable, or one for which all values of p are equally probable, 
cannot be attained in practice, in the first case because of limitations 
of size and in the second because of limitations of energy. Thus an 
eigenstate of p or an eigenstate of q cannot be attained in practice. 
The argument at the end of § 12 already showed that such eigenstates 
are unattainable, because of the infinite precision that would be 
needed to set them up, and we now have another argument leading 
to the same conclusion. 

25. Displacement operators 

We get a new insight into the meaning of some of the quantum con- 
ditions by making a study of displacement operators. These appear 
in the theory when we take into consideration that the scheme of 
relations between states and dynamical variables given in Chapter II 
is essentially a physical scheme, so that if certain states and dynamical 
variables are connected by some relation, on our displacing them all 
in a definite way (for example, displacing them all through a distance
$\delta x$ in the direction of the x-axis of Cartesian coordinates), the new
states and dynamical variables would have to be connected by the 
same relation. 

The displacement of a state or observable is a perfectly definite
process physically. Thus to displace a state or observable through a
distance $\delta x$ in the direction of the x-axis, we should merely have to
displace all the apparatus used in preparing the state, or all the
apparatus required to measure the observable, through the distance
$\delta x$ in the direction of the x-axis, and the displaced apparatus would
define the displaced state or observable. The displacement of a
dynamical variable must be just as definite as the displacement of
an observable, because of the close mathematical connexion between
dynamical variables and observables. A displaced state or dynamical
variable is uniquely determined by the undisplaced state or dynamical
variable together with the direction and magnitude of the
displacement.

The displacement of a ket vector is not such a definite thing though.
If we take a certain ket vector, it will represent a certain state and we
may displace this state and get a perfectly definite new state, but this
new state will not determine our displaced ket, but only the direction
of our displaced ket. We help to fix our displaced ket by requiring
that it shall have the same length as the undisplaced ket, but even
then it is not completely determined, but can still be multiplied by
an arbitrary phase factor. One would think at first sight that each
ket one displaces would have a different arbitrary phase factor,
but with the help of the following argument, we see that it must be
the same for them all. We make use of the law that superposition
relationships between states remain invariant under the displacement.
A superposition relationship between states is expressed
mathematically by a linear equation between the kets corresponding
to those states, for example

$$|R\rangle = c_1|A\rangle + c_2|B\rangle, \tag{57}$$

where $c_1$ and $c_2$ are numbers, and the invariance of the superposition
relationship requires that the displaced states correspond to kets
with the same linear equation between them; in our example they
would correspond to $|Rd\rangle$, $|Ad\rangle$, $|Bd\rangle$ say, satisfying

$$|Rd\rangle = c_1|Ad\rangle + c_2|Bd\rangle. \tag{58}$$

We take these kets to be our displaced kets, rather than these kets
multiplied by arbitrary independent phase factors, which latter
kets would satisfy a linear equation with different coefficients $c_1$, $c_2$.
The only arbitrariness now left in the displaced kets is that of a single
arbitrary phase factor to be multiplied into all of them.

The condition that linear equations between the kets remain
invariant under the displacement and that an equation such as (58)
holds whenever the corresponding (57) holds, means that the
displaced kets are linear functions of the undisplaced kets and thus each
displaced ket $|Pd\rangle$ is the result of some linear operator applied to the
corresponding undisplaced ket $|P\rangle$. In symbols,

$$|Pd\rangle = D|P\rangle, \tag{59}$$

where D is a linear operator independent of $|P\rangle$ and depending only
on the displacement. The arbitrary phase factor by which all the
displaced kets may be multiplied results in D being undetermined
to the extent of an arbitrary numerical factor of modulus unity.

With the displacement of kets made definite in the above manner 
and the displacement of bras, of course, made equally definite, 
through their being the conjugate imaginaries of the kets, we can 
now assert that any symbolic equation between kets, bras, and 
dynamical variables must remain invariant under the displacement 
of every symbol occurring in it, on account of such an equation
having some physical significance which will not get changed by the 
displacement. 

Take as an example the equation

$$\langle Q|P\rangle = c,$$

c being a number. Then we must have

$$\langle Qd|Pd\rangle = c = \langle Q|P\rangle. \tag{60}$$

From the conjugate imaginary of (59) with Q instead of P,

$$\langle Qd| = \langle Q|\bar{D}. \tag{61}$$

Hence (60) gives

$$\langle Q|\bar{D}D|P\rangle = \langle Q|P\rangle.$$

Since this holds for arbitrary $\langle Q|$ and $|P\rangle$, we must have

$$\bar{D}D = 1, \tag{62}$$

giving us a general condition which D has to satisfy.

Take as a second example the equation

$$v|P\rangle = |R\rangle,$$

where v is any dynamical variable. Then, using $v_d$ to denote the
displaced dynamical variable, we must have

$$v_d|Pd\rangle = |Rd\rangle.$$

With the help of (59) we get

$$v_d|Pd\rangle = D|R\rangle = Dv|P\rangle = DvD^{-1}|Pd\rangle.$$

Since $|Pd\rangle$ can be any ket, we must have

$$v_d = DvD^{-1}, \tag{63}$$

which shows that the linear operator D determines the displacement
of dynamical variables as well as that of kets and bras. Note that
the arbitrary numerical factor of modulus unity in D does not affect
$v_d$, and also it does not affect the validity of (62).

Let us now pass to an infinitesimal displacement, i.e. taking the
displacement through the distance $\delta x$ in the direction of the x-axis,
let us make $\delta x \to 0$. From physical continuity we should expect
a displaced ket $|Pd\rangle$ to tend to the original $|P\rangle$ and we may further
expect the limit

$$\lim_{\delta x\to 0}\frac{|Pd\rangle - |P\rangle}{\delta x} = \lim_{\delta x\to 0}\frac{D-1}{\delta x}\,|P\rangle$$

to exist. This requires that the limit

$$\lim_{\delta x\to 0}\,(D-1)/\delta x \tag{64}$$

shall exist. This limit is a linear operator which we shall call the
displacement operator for the x-direction and denote by $d_x$. The
arbitrary numerical factor $e^{i\gamma}$ with $\gamma$ real which we may multiply
into D must be made to tend to unity as $\delta x \to 0$ and then introduces
an arbitrariness in $d_x$, namely, $d_x$ may be replaced by

$$\lim_{\delta x\to 0}\,(De^{i\gamma}-1)/\delta x = \lim_{\delta x\to 0}\,(D-1+i\gamma)/\delta x = d_x + ia_x,$$

where $a_x$ is the limit of $\gamma/\delta x$. Thus $d_x$ contains an arbitrary additive
pure imaginary number.

For $\delta x$ small

$$D = 1 + \delta x\,d_x. \tag{65}$$

Substituting this into (62), we get

$$(1+\delta x\,\bar{d}_x)(1+\delta x\,d_x) = 1,$$

which reduces, with neglect of $\delta x^2$, to

$$\delta x\,(\bar{d}_x + d_x) = 0.$$

Thus $d_x$ is a pure imaginary linear operator. Substituting (65) into
(63) we get, with neglect of $\delta x^2$ again,

$$v_d = (1+\delta x\,d_x)\,v\,(1-\delta x\,d_x) = v + \delta x\,(d_x v - v d_x), \tag{66}$$

showing that

$$\lim_{\delta x\to 0}\,(v_d - v)/\delta x = d_x v - v d_x. \tag{67}$$


We may describe any dynamical system in terms of the following
dynamical variables: the Cartesian coordinates x, y, z of the centre of
mass of the system, the components $p_x$, $p_y$, $p_z$ of the total momentum
of the system, which are the canonical momenta conjugate to x, y, z
respectively, and any dynamical variables needed for describing
internal degrees of freedom of the system. If we suppose a piece
of apparatus which has been set up to measure x, to be displaced a
distance $\delta x$ in the direction of the x-axis, it will measure $x - \delta x$, hence

$$x_d = x - \delta x.$$

Comparing this with (66) for v = x, we obtain

$$d_x x - x d_x = -1. \tag{68}$$

This is the quantum condition connecting $d_x$ with x. From similar
arguments we find that y, z, $p_x$, $p_y$, $p_z$ and the internal dynamical
variables, which are unaffected by the displacement, must commute with
$d_x$. Comparing these results with (9), we see that $i\hbar d_x$ satisfies just
the same quantum conditions as $p_x$. Their difference, $p_x - i\hbar d_x$,
commutes with all the dynamical variables and must therefore be a
number. This number, which is necessarily real since $p_x$ and $i\hbar d_x$ are
both real, may be made zero by a suitable choice of the arbitrary,
pure imaginary number that can be added to $d_x$. We then have the
result

$$p_x = i\hbar\,d_x, \tag{69}$$

or the x-component of the total momentum of the system is $i\hbar$ times the
displacement operator $d_x$.
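The relation between the momentum and the displacement operator can be illustrated with a simple wave function. The sketch below makes assumptions that are not spelled out at this point in the text: units with ħ = 1, the Schrödinger representation, in which the ket displaced through δx has the representative ψ(x − δx), and a plane wave exp(ikx) as the test state. It approximates the displacement operator by (D − 1)/δx for a small but finite δx, and checks that iħ times the result multiplies the plane wave by ħk.

```python
import cmath

HBAR = 1.0  # assumed units


def plane_wave(k):
    """Assumed test representative psi(x) = exp(i k x)."""
    return lambda x: cmath.exp(1j * k * x)


def displaced(psi, dx):
    """Representative of the ket displaced through dx along x: psi(x - dx)."""
    return lambda x: psi(x - dx)


def momentum_from_displacement(psi, x, dx=1e-6):
    """Estimate (i hbar d_x psi)(x), with d_x approximated by (D - 1)/dx."""
    d_x_psi = (displaced(psi, dx)(x) - psi(x)) / dx
    return 1j * HBAR * d_x_psi


if __name__ == "__main__":
    k = 3.0
    psi = plane_wave(k)
    x = 0.7
    print(momentum_from_displacement(psi, x) / psi(x), HBAR * k)
```

The ratio printed differs from ħk only by a term of order δx, which disappears in the limit defining the displacement operator.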


This is a fundamental result, which gives a new significance to
displacement operators. There is a corresponding result, of course,
also for the y and z displacement operators $d_y$ and $d_z$. The quantum
conditions which state that $p_x$, $p_y$ and $p_z$ commute with each other
are now seen to be connected with the fact that displacements in
different directions are commutable operations.


26. Unitary transformations

Let U be any linear operator that has a reciprocal $U^{-1}$ and
consider the equation

$$\alpha^* = U\alpha U^{-1}, \tag{70}$$

$\alpha$ being an arbitrary linear operator. This equation may be regarded
as expressing a transformation from any linear operator $\alpha$ to a
corresponding linear operator $\alpha^*$, and as such it has rather remarkable
properties. In the first place it should be noted that each $\alpha^*$ has the
same eigenvalues as the corresponding $\alpha$; since, if $\alpha'$ is any eigenvalue
of $\alpha$ and $|\alpha'\rangle$ is an eigenket belonging to it, we have

$$\alpha|\alpha'\rangle = \alpha'|\alpha'\rangle$$

and hence

$$\alpha^* U|\alpha'\rangle = U\alpha U^{-1}U|\alpha'\rangle = U\alpha|\alpha'\rangle = \alpha' U|\alpha'\rangle,$$

showing that $U|\alpha'\rangle$ is an eigenket of $\alpha^*$ belonging to the same
eigenvalue $\alpha'$, and similarly any eigenvalue of $\alpha^*$ may be shown to be also
an eigenvalue of $\alpha$. Further, if we take several $\alpha$'s that are connected
by algebraic equations and transform them all according to (70), the
corresponding $\alpha^*$'s will be connected by the same algebraic equations.
This result follows from the fact that the fundamental algebraic
processes of addition and multiplication are left invariant by the
transformation (70), as is shown by the following equations:

$$(\alpha_1+\alpha_2)^* = U(\alpha_1+\alpha_2)U^{-1} = U\alpha_1U^{-1} + U\alpha_2U^{-1} = \alpha_1^* + \alpha_2^*,$$

$$(\alpha_1\alpha_2)^* = U\alpha_1\alpha_2U^{-1} = U\alpha_1U^{-1}U\alpha_2U^{-1} = \alpha_1^*\alpha_2^*.$$
Let us now see what condition would be imposed on U by the
requirement that any real $\alpha$ transforms into a real $\alpha^*$. Equation
(70) may be written

$$\alpha^* U = U\alpha. \tag{71}$$

Taking the conjugate complex of both sides in accordance with
(5) of § 8 we find, if $\alpha$ and $\alpha^*$ are both real,

$$\bar{U}\alpha^* = \alpha\bar{U}. \tag{72}$$

Equation (71) gives us

$$\bar{U}\alpha^* U = \bar{U}U\alpha$$

and equation (72) gives us

$$\bar{U}\alpha^* U = \alpha\bar{U}U.$$

Hence

$$\bar{U}U\alpha = \alpha\bar{U}U.$$

Thus $\bar{U}U$ commutes with any real linear operator and therefore also
with any linear operator whatever, since any linear operator can be
expressed as one real one plus i times another. Hence $\bar{U}U$ is a
number. It is obviously real, its conjugate complex according to (5)
of § 8 being the same as itself, and further it must be a positive
number, since for any ket $|P\rangle$, $\langle P|\bar{U}U|P\rangle$ is positive as well as
$\langle P|P\rangle$. We can suppose it to be unity without any loss of generality
in the transformation (70). We then have

$$\bar{U}U = 1. \tag{73}$$

Equation (73) is equivalent to any of the following

$$U = \bar{U}^{-1}, \qquad \bar{U} = U^{-1}, \qquad U^{-1}\bar{U}^{-1} = 1. \tag{74}$$

A matrix or linear operator U that satisfies (73) and (74) is said
to be unitary and a transformation (70) with unitary U is called a
unitary transformation. A unitary transformation transforms real
linear operators into real linear operators and leaves invariant any
algebraic equation between linear operators. It may be considered
as applying also to kets and bras, in accordance with the equations

$$|P^*\rangle = U|P\rangle, \qquad \langle P^*| = \langle P|\bar{U} = \langle P|U^{-1}, \tag{75}$$

and then it leaves invariant any algebraic equation between linear
operators, kets, and bras. It transforms eigenvectors of $\alpha$ into
eigenvectors of $\alpha^*$. From this one can easily deduce that it transforms an
observable into an observable and that it leaves invariant any
functional relation between observables based on the general definition
of a function given in § 11.

The inverse of a unitary transformation is also a unitary
transformation, since from (74), if U is unitary, $U^{-1}$ is also unitary.
Further, if two unitary transformations are applied in succession,
the result is a third unitary transformation, as may be verified in
the following way. Let the two unitary transformations be (70) and

$$\alpha^\dagger = V\alpha^* V^{-1}.$$

The connexion between $\alpha^\dagger$ and $\alpha$ is then

$$\alpha^\dagger = VU\alpha U^{-1}V^{-1} = (VU)\alpha(VU)^{-1} \tag{76}$$

from (42) of § 11. Now VU is unitary since

$$\overline{VU}\,VU = \bar{U}\bar{V}VU = \bar{U}U = 1,$$

and hence (76) is a unitary transformation.

The transformation given in the preceding section from undisplaced
to displaced quantities is an example of a unitary transformation, as
is shown by equations (62), (63), corresponding to equations (73),
(70), and equations (59), (61), corresponding to equations (75).
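The statements that the product of two unitary operators is unitary, and that a unitary transformation turns real linear operators into real ones, can be checked on 2×2 matrices. In the sketch below the conjugate complex of an operator is represented by the conjugate transposed matrix, and "real" is taken in the sense of the text, namely equal to its own conjugate complex. The particular matrices `U`, `V` and `ALPHA` are arbitrary choices made here for illustration.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(m):
    """Conjugate transposed matrix, representing the conjugate complex operator."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def is_unitary(m):
    """Condition (73): U-bar U = 1."""
    return close(matmul(dagger(m), m), IDENTITY)


def is_real(m):
    """'Real' in the sense of the text: equal to its own conjugate complex."""
    return close(m, dagger(m))


def transform(u, alpha):
    """alpha* = U alpha U^{-1}, using U^{-1} = U-bar for a unitary U."""
    return matmul(matmul(u, alpha), dagger(u))


# Arbitrary example operators (assumptions of this sketch).
THETA = 0.4
U = [[math.cos(THETA), -math.sin(THETA)], [math.sin(THETA), math.cos(THETA)]]
V = [[cmath.exp(0.3j), 0.0], [0.0, cmath.exp(-1.1j)]]
ALPHA = [[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -3.0]]

if __name__ == "__main__":
    vu = matmul(V, U)
    print(is_unitary(vu), is_real(transform(vu, ALPHA)))
```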

In classical mechanics one can make a transformation from the
canonical coordinates and momenta $q_r$, $p_r$ (r = 1,.., n) to a new set of
variables $q_r^*$, $p_r^*$ (r = 1,.., n) satisfying the same P.B. relations as the
q's and p's, i.e. equations (8) of § 21 with q*'s and p*'s replacing the
q's and p's, and can express all dynamical variables in terms of the q*'s
and p*'s. The q*'s and p*'s are then also called canonical coordinates
and momenta and the transformation is called a contact
transformation. One can easily verify that the P.B. of any two dynamical
variables u and v is correctly given by formula (1) of § 21 with q*'s and
p*'s instead of q's and p's, so that the P.B. relationship is invariant
under a contact transformation. This results in the new canonical
coordinates and momenta being on the same footing as the original
ones for many purposes of general dynamical theory, even though the
new coordinates $q_r^*$ may not be a set of Lagrangian coordinates but
may be functions of the Lagrangian coordinates and velocities.

It will now be shown that, for a quantum dynamical system that
has a classical analogue, unitary transformations in the quantum theory
are the analogue of contact transformations in the classical theory.
Unitary transformations are more general than contact
transformations, since the former can be applied to systems in quantum
mechanics that have no classical analogue, but for those systems in
quantum mechanics which are describable in terms of canonical
coordinates and momenta, the analogy between the two kinds of
transformation holds. To establish it, we note that a unitary
transformation applied to the quantum variables $q_r$, $p_r$ gives new variables
$q_r^*$, $p_r^*$ satisfying the same P.B. relations, since the P.B. relations are
equivalent to the algebraic relations (9) of § 21 and algebraic relations
are left invariant by a unitary transformation. Conversely, any real
variables $q_r^*$, $p_r^*$ satisfying the P.B. relations for canonical coordinates
and momenta are connected with the $q_r$, $p_r$ by a unitary
transformation, as is shown by the following argument.

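We use the Schrödinger representation, and write the basic ket
$|q'_1 \ldots q'_n\rangle$ as $|q'\rangle$ for brevity. Since we are assuming that the $q_r^*$, $p_r^*$
satisfy the P.B. relations for canonical coordinates and momenta,
we can set up a Schrödinger representation referring to them, with
the $q_r^*$ diagonal and each $p_r^*$ equal to $-i\hbar\,\partial/\partial q_r^*$. The basic kets in
this second Schrödinger representation will be $|q_1^{*\prime} \ldots q_n^{*\prime}\rangle$, which we
write $|q^{*\prime}\rangle$ for brevity. Now introduce the linear operator U defined by

$$\langle q^{*\prime}|U|q'\rangle = \delta(q^{*\prime}-q'), \tag{77}$$

where $\delta(q^{*\prime}-q')$ is short for

$$\delta(q^{*\prime}-q') = \delta(q_1^{*\prime}-q_1')\,\delta(q_2^{*\prime}-q_2')\ldots\delta(q_n^{*\prime}-q_n'). \tag{78}$$

The conjugate complex of (77) is

$$\langle q'|\bar{U}|q^{*\prime}\rangle = \delta(q^{*\prime}-q'),$$

and hence†

$$\langle q'|\bar{U}U|q''\rangle = \int \langle q'|\bar{U}|q^{*\prime}\rangle\,dq^{*\prime}\,\langle q^{*\prime}|U|q''\rangle = \int \delta(q^{*\prime}-q')\,dq^{*\prime}\,\delta(q^{*\prime}-q'') = \delta(q'-q''),$$

so that

$$\bar{U}U = 1.$$

† We use the notation of a single integral sign and $dq^{*\prime}$ to denote an integral over
all the variables $q_1^{*\prime}, q_2^{*\prime}, \ldots, q_n^{*\prime}$. This abbreviation will be used also in future work.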
Thus U is a unitary operator. We have further

$$\langle q^{*\prime}|q_r^* U|q'\rangle = q_r^{*\prime}\,\delta(q^{*\prime}-q')$$

and

$$\langle q^{*\prime}|U q_r|q'\rangle = q_r'\,\delta(q^{*\prime}-q').$$

The right-hand sides of these two equations are equal on account of
the property of the δ function (11) of § 15, and hence

$$q_r^* U = U q_r$$

or

$$q_r^* = U q_r U^{-1}.$$

Again, from (45) and (46),

$$\langle q^{*\prime}|p_r^* U|q'\rangle = -i\hbar\,\frac{\partial}{\partial q_r^{*\prime}}\,\delta(q^{*\prime}-q'),$$

$$\langle q^{*\prime}|U p_r|q'\rangle = i\hbar\,\frac{\partial}{\partial q_r'}\,\delta(q^{*\prime}-q').$$

The right-hand sides of these two equations are obviously equal, and
hence

$$p_r^* U = U p_r$$

or

$$p_r^* = U p_r U^{-1}.$$

Thus all the conditions for a unitary transformation are verified.

We get an infinitesimal unitary transformation by taking U in (70)
to differ by an infinitesimal from unity. Put

$$U = 1 + i\epsilon F,$$

where $\epsilon$ is infinitesimal, so that its square can be neglected. Then

$$U^{-1} = 1 - i\epsilon F.$$

The unitary condition (73) or (74) requires that F shall be real. The
transformation equation (70) now takes the form

$$\alpha^* = (1+i\epsilon F)\,\alpha\,(1-i\epsilon F),$$

which gives

$$\alpha^* - \alpha = i\epsilon\,(F\alpha - \alpha F). \tag{79}$$

It may be written in P.B. notation

$$\alpha^* - \alpha = \epsilon\hbar\,[\alpha, F]. \tag{80}$$

If $\alpha$ is a canonical coordinate or momentum, this is formally the same
as a classical infinitesimal contact transformation.
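The neglect of the square of ε in (79) can be examined numerically. The sketch below is illustrative and uses assumptions made here: a particular real (self-adjoint) 2×2 matrix `F`, a particular `ALPHA`, and small finite values of ε. It forms the exact transformed operator with U = 1 + iεF and its exact reciprocal, and compares the change in α with the first-order expression iε(Fα − αF); the discrepancy should be of order ε squared.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matsub(a, b):
    """Difference of two 2x2 matrices."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def inverse(m):
    """Reciprocal of a 2x2 matrix with non-vanishing determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Arbitrary example operators (assumptions of this sketch); F is self-adjoint.
F = [[1.0, 1j], [-1j, 2.0]]
ALPHA = [[0.0, 1.0], [1.0, 0.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def exact_change(eps):
    """alpha* - alpha with U = 1 + i eps F and the exact reciprocal of U."""
    u = [[1.0 + 1j * eps * F[0][0], 1j * eps * F[0][1]],
         [1j * eps * F[1][0], 1.0 + 1j * eps * F[1][1]]]
    return matsub(matmul(matmul(u, ALPHA), inverse(u)), ALPHA)


def first_order_change(eps):
    """Right-hand side of (79): i eps (F alpha - alpha F)."""
    comm = matsub(matmul(F, ALPHA), matmul(ALPHA, F))
    return [[1j * eps * c for c in row] for row in comm]


def max_abs_difference(a, b):
    """Largest modulus among the elements of a - b."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        print(eps, max_abs_difference(exact_change(eps), first_order_change(eps)))
```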

THE EQUATIONS OF MOTION

27. Schrödinger's form for the equations of motion
Our work from § 5 onwards has all been concerned with one instant
of time. It gave the general scheme of relations between states and
dynamical variables for a dynamical system at one instant of time.
To get a complete theory of dynamics we must consider also the
connexion between different instants of time. When one makes an
observation on the dynamical system, the state of the system gets
changed in an unpredictable way, but in between observations
causality applies, in quantum mechanics as in classical mechanics,
and the system is governed by equations of motion which make the
state at one time determine the state at a later time. These equations
of motion we now proceed to study. They will apply so long as the
dynamical system is left undisturbed by any observation or similar
process.† Their general form can be deduced from the principle of
superposition of Chapter I.

Let us consider a particular state of motion throughout the time
during which the system is left undisturbed. We shall have the state
at any time t corresponding to a certain ket which depends on t and
which may be written $|t\rangle$. If we deal with several of these states of
motion we distinguish them by giving them labels such as A, and we
then write the ket which corresponds to the state at time t for one
of them $|At\rangle$. The requirement that the state at one time determines
the state at another time means that $|At_0\rangle$ determines $|At\rangle$ except
for a numerical factor. The principle of superposition applies to these
states of motion throughout the time during which the system is
undisturbed, and means that if we take a superposition relation
holding for certain states at time $t_0$ and giving rise to a linear equation
between the corresponding kets, e.g. the equation

$$|Rt_0\rangle = c_1|At_0\rangle + c_2|Bt_0\rangle,$$

the same superposition relation must hold between the states of
motion throughout the time during which the system is undisturbed
and must lead to the same equation between the kets corresponding
to these states at any time t (in the undisturbed time interval), i.e.
the equation

$$|Rt\rangle = c_1|At\rangle + c_2|Bt\rangle,$$

provided the arbitrary numerical factors by which these kets may be
multiplied are suitably chosen. It follows that the $|Pt\rangle$'s are linear
functions of the $|Pt_0\rangle$'s and each $|Pt\rangle$ is the result of some linear
operator applied to $|Pt_0\rangle$. In symbols

$$|Pt\rangle = T|Pt_0\rangle, \tag{1}$$

where T is a linear operator independent of P and depending only
on t (and $t_0$).

† The preparation of a state is a process of this kind. It often takes the form of
making an observation and selecting the system when the result of the observation
turns out to be a certain pre-assigned number.

We now assume that each $|Pt\rangle$ has the same length as the
corresponding $|Pt_0\rangle$. It is not necessarily possible to choose the arbitrary
numerical factors by which the $|Pt\rangle$'s may be multiplied so as to
make this so without destroying the linear dependence of the $|Pt\rangle$'s
on the $|Pt_0\rangle$'s, so the new assumption is a physical one and not just
a question of notation. It involves a kind of sharpening of the
principle of superposition. The arbitrariness in $|Pt\rangle$ now becomes
merely a phase factor, which must be independent of P in order that
the linear dependence of the $|Pt\rangle$'s on the $|Pt_0\rangle$'s may be preserved.
From the condition that the length of $c_1|Pt\rangle + c_2|Qt\rangle$ equals that of
$c_1|Pt_0\rangle + c_2|Qt_0\rangle$ for any complex numbers $c_1$, $c_2$, we can deduce that

$$\langle Qt|Pt\rangle = \langle Qt_0|Pt_0\rangle. \tag{2}$$

The connexion between the $|Pt\rangle$'s and $|Pt_0\rangle$'s is formally similar
to the connexion we had in § 25 between the displaced and undisplaced
kets, with a process of time displacement instead of the space
displacement of § 25. Equations (1) and (2) play the part of equations (59)
and (60) of § 25. We can develop the consequences of these equations
as in § 25 and can deduce that T contains an arbitrary numerical
factor of modulus unity and satisfies

$$\bar{T}T = 1, \tag{3}$$

corresponding to (62) of § 25, so T is unitary. We pass to the
infinitesimal case by making $t \to t_0$ and assume from physical continuity that
the limit

$$\lim_{t\to t_0}\frac{|Pt\rangle - |Pt_0\rangle}{t-t_0}$$

exists. This limit is just the derivative of $|Pt_0\rangle$ with respect to $t_0$.
From (1) it equals

$$\frac{d|Pt_0\rangle}{dt_0} = \left\{\lim_{t\to t_0}\frac{T-1}{t-t_0}\right\}|Pt_0\rangle. \tag{4}$$

The limit operator occurring here is, like (64) of § 25, a pure imaginary
linear operator and is undetermined to the extent of an arbitrary
additive pure imaginary number. Putting this limit operator
multiplied by $i\hbar$ equal to H, or rather $H(t_0)$ since it may depend on $t_0$,
equation (4) becomes, when written for a general t,

$$i\hbar\,\frac{d|Pt\rangle}{dt} = H(t)|Pt\rangle. \tag{5}$$

Equation (5) gives the general law for the variation with time of
the ket corresponding to the state at any time. It is Schrödinger's
form for the equations of motion. It involves just one real linear
operator H(t), which must be characteristic of the dynamical system
under consideration. We assume that H(t) is the total energy of
the system. There are two justifications for this assumption, (i) the
analogy with classical mechanics, which will be developed in the
next section, and (ii) we have H(t) appearing as $i\hbar$ times an operator
of displacement in time similar to the operators of displacement in
the x, y, and z directions of § 25, so corresponding to (69) of § 25
we should have H(t) equal to the total energy, since the theory of
relativity puts energy in the same relation to time as momentum to
distance.
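The way equation (5) propagates a ket in time, while preserving its length, may be seen in the simplest case of a system with two independent states. The sketch below is an illustration under assumptions made here and not taken from the text: units with ħ = 1, a constant energy operator represented by the matrix `ENERGY`, an initial ket with components (1, 0), and a standard fourth-order Runge-Kutta integration. For this choice the exact solution is (cos t, −i sin t), against which the result can be compared.

```python
import math

HBAR = 1.0  # assumed units

# Assumed constant energy operator for a two-state system.
ENERGY = [[0.0, 1.0], [1.0, 0.0]]


def apply(op, ket):
    """Apply a 2x2 matrix to a two-component ket."""
    return [op[0][0] * ket[0] + op[0][1] * ket[1],
            op[1][0] * ket[0] + op[1][1] * ket[1]]


def derivative(op, ket):
    """d|Pt>/dt = H|Pt> / (i hbar), from equation (5)."""
    return [c / (1j * HBAR) for c in apply(op, ket)]


def rk4_step(op, ket, dt):
    """One fourth-order Runge-Kutta step of length dt."""
    k1 = derivative(op, ket)
    k2 = derivative(op, [c + 0.5 * dt * k for c, k in zip(ket, k1)])
    k3 = derivative(op, [c + 0.5 * dt * k for c, k in zip(ket, k2)])
    k4 = derivative(op, [c + dt * k for c, k in zip(ket, k3)])
    return [c + dt * (a + 2.0 * b + 2.0 * d + e) / 6.0
            for c, a, b, d, e in zip(ket, k1, k2, k3, k4)]


def evolve(op, ket, t, steps=1000):
    """Integrate equation (5) from time 0 to time t."""
    dt = t / steps
    for _ in range(steps):
        ket = rk4_step(op, ket, dt)
    return ket


def squared_length(ket):
    return sum(abs(c) ** 2 for c in ket)


if __name__ == "__main__":
    final = evolve(ENERGY, [1.0 + 0j, 0j], 1.0)
    print(final, squared_length(final))
```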

We assume on physical grounds that the total energy of a system
is always an observable. For an isolated system it is a constant, and
may then be written H. Even when it is not a constant we shall often
write it simply H, leaving its dependence on t understood. If the
energy depends on t, it means the system is acted on by external
forces. An action of this kind is to be distinguished from a
disturbance caused by a process of observation, as the former is compatible
with causality and equations of motion while the latter is not.

We can get a connexion between H(t) and the T of equation (1)
by substituting for $|Pt\rangle$ in (5) its value given by equation (1). This
gives

$$i\hbar\,\frac{dT}{dt}\,|Pt_0\rangle = H(t)T|Pt_0\rangle.$$

Since $|Pt_0\rangle$ may be any ket, we have

$$i\hbar\,\frac{dT}{dt} = H(t)T. \tag{6}$$

Equation (5) is very important for practical problems, where it is
usually used in conjunction with a representation. Introducing a
representation with a complete set of commuting observables $\xi$
diagonal and putting $\langle\xi'|Pt\rangle$ equal to $\psi(\xi' t)$, we have, passing to the
standard ket notation,

$$|Pt\rangle = \psi(\xi t)\rangle.$$

Equation (5) now becomes

$$i\hbar\,\frac{\partial}{\partial t}\,\psi(\xi t)\rangle = H\psi(\xi t)\rangle. \tag{7}$$

Equation (7) is known as Schrödinger's wave equation and its solutions
$\psi(\xi t)$ are time-dependent wave functions. Each solution corresponds to
a state of motion of the system and the square of its modulus gives
the probability of the $\xi$'s having specified values at any time t. For
a system describable in terms of canonical coordinates and momenta
we may use Schrödinger's representation and can then take H to be
an operator of differentiation in accordance with (42) of § 22.


28. Heisenberg's form for the equations of motion

In the preceding section we set up a picture of the states of
undisturbed motion by making each of them correspond to a moving
ket, the state at any time corresponding to the ket at that time. We
shall call this the Schrödinger picture. Let us apply to our kets the
unitary transformation which makes each ket $|a\rangle$ go over into

$$|a^*\rangle = T^{-1}|a\rangle. \tag{8}$$

This transformation is of the form given by (75) of § 26 with $T^{-1}$ for
U, but it depends on the time t since T depends on t. It is thus to be
pictured as the application of a continuous motion (consisting of
rotations and uniform deformations) to the whole ket vector space.
A ket which is originally fixed becomes a moving one, its motion being
given by (8) with $|a\rangle$ independent of t. On the other hand, a ket
which is originally moving to correspond to a state of undisturbed
motion, i.e. in accordance with equation (1), becomes fixed, since on
substituting $|Pt\rangle$ for $|a\rangle$ in (8) we get $|a^*\rangle$ independent of t. Thus
the transformation brings the kets corresponding to states of undisturbed
motion to rest.

The unitary transformation must be applied also to bras and linear
operators, in order that equations between the various quantities may
remain invariant. The transformation applied to bras is given by the
conjugate imaginary of (8) and applied to linear operators it is given
by (70) of § 26 with $T^{-1}$ for $U$, i.e.

$$\alpha^* = T^{-1}\alpha T. \qquad (9)$$

A linear operator which is originally fixed transforms into a moving
linear operator in general. Now a dynamical variable corresponds to
a linear operator which is originally fixed (because it does not refer
to $t$ at all), so after the transformation it corresponds to a moving
linear operator. The transformation thus leads us to a new picture
of the motion, in which the states correspond to fixed vectors and
the dynamical variables to moving linear operators. We shall call
this the Heisenberg picture.

The physical condition of the dynamical system at any time
involves the relation of the dynamical variables to the state, and
the change of the physical condition with time may be ascribed
either to a change in the state, with the dynamical variables kept
fixed, which gives us the Schrödinger picture, or to a change in the
dynamical variables, with the state kept fixed, which gives us the
Heisenberg picture.

In the Heisenberg picture there are equations of motion for the
dynamical variables. Take a dynamical variable corresponding to
the fixed linear operator $v$ in the Schrödinger picture. In the Heisenberg
picture it corresponds to a moving linear operator, which we
write as $v_t$ instead of $v^*$, to bring out its dependence on $t$, and which
is given by

$$v_t = T^{-1}vT \qquad (10)$$

or

$$Tv_t = vT.$$

Differentiating with respect to $t$, we get

$$\frac{dT}{dt}v_t + T\frac{dv_t}{dt} = v\frac{dT}{dt}.$$

With the help of (6), this gives

$$HTv_t + i\hbar\,T\frac{dv_t}{dt} = vHT$$

or

$$i\hbar\frac{dv_t}{dt} = T^{-1}vHT - T^{-1}HTv_t = v_tH_t - H_tv_t, \qquad (11)$$

where

$$H_t = T^{-1}HT. \qquad (12)$$

Equation (11) may be written in P.B. notation

$$\frac{dv_t}{dt} = [v_t, H_t]. \qquad (13)$$
Equation (11) or (13) shows how any dynamical variable varies
with time in the Heisenberg picture and gives us Heisenberg’s form
for the equations of motion. These equations of motion are determined
by the one linear operator $H_t$, which is just the transform of the linear
operator $H$ occurring in Schrödinger’s form for the equations of
motion and corresponds to the energy in the Heisenberg picture. We
shall call the dynamical variables in the Heisenberg picture, where
they vary with the time, Heisenberg dynamical variables, to distinguish
them from the fixed dynamical variables of the Schrödinger picture,
which we shall call Schrödinger dynamical variables. Each Heisenberg
dynamical variable is connected with the corresponding Schrödinger
dynamical variable by equation (10). Since this connexion is a unitary
transformation, all algebraic and functional relationships are the
same for both kinds of dynamical variable. We have $T = 1$ for
$t = t_0$, so that $v_{t_0} = v$ and any Heisenberg dynamical variable at time
$t_0$ equals the corresponding Schrödinger dynamical variable.

Equation (13) can be compared with classical mechanics, where we
also have dynamical variables varying with the time. The equations
of motion of classical mechanics can be written in the Hamiltonian
form

$$\frac{dq_r}{dt} = \frac{\partial H}{\partial p_r}, \qquad \frac{dp_r}{dt} = -\frac{\partial H}{\partial q_r}, \qquad (14)$$

where the $q$’s and $p$’s are a set of canonical coordinates and momenta
and $H$ is the energy expressed as a function of them and possibly also
of $t$. The energy expressed in this way is called the Hamiltonian.
Equations (14) give, for $v$ any function of the $q$’s and $p$’s that does
not contain the time $t$ explicitly,

$$\frac{dv}{dt} = \sum_r\left\{\frac{\partial v}{\partial q_r}\frac{dq_r}{dt} + \frac{\partial v}{\partial p_r}\frac{dp_r}{dt}\right\} = \sum_r\left\{\frac{\partial v}{\partial q_r}\frac{\partial H}{\partial p_r} - \frac{\partial v}{\partial p_r}\frac{\partial H}{\partial q_r}\right\} = [v, H], \qquad (15)$$

with the classical definition of a P.B., equation (1) of § 21. This is
of the same form as equation (13) in the quantum theory. We thus
get an analogy between the classical equations of motion in the
Hamiltonian form and the quantum equations of motion in Heisenberg’s
form. This analogy provides a justification for the assumption
that the linear operator $H$ introduced in the preceding section is the
energy of the system in quantum mechanics.
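The classical relation (15) can be checked numerically. The following is only a sketch under stated assumptions: the oscillator Hamiltonian, the choice $v = qp$, the unit mass and force constant, and all function names are illustrative and do not come from the text.

```python
# Finite-difference check of dv/dt = [v, H] for a classical system.
# Assumptions (not from the text): H = p^2/(2m) + k q^2/2, v = q*p, m = k = 1.
import math

M, K = 1.0, 1.0

def hamiltonian(q, p):
    return p * p / (2 * M) + K * q * q / 2

def v(q, p):
    return q * p

def partial(f, q, p, wrt, h=1e-6):
    """Central-difference partial derivative of f with respect to 'q' or 'p'."""
    if wrt == "q":
        return (f(q + h, p) - f(q - h, p)) / (2 * h)
    return (f(q, p + h) - f(q, p - h)) / (2 * h)

def poisson_bracket(u, w, q, p):
    """Classical P.B. [u, w] = du/dq dw/dp - du/dp dw/dq for one degree of freedom."""
    return (partial(u, q, p, "q") * partial(w, q, p, "p")
            - partial(u, q, p, "p") * partial(w, q, p, "q"))

def trajectory(t, q0=1.0, p0=0.5):
    """Exact solution of Hamilton's equations (14) for the oscillator."""
    w = math.sqrt(K / M)
    q = q0 * math.cos(w * t) + p0 / (M * w) * math.sin(w * t)
    p = p0 * math.cos(w * t) - M * w * q0 * math.sin(w * t)
    return q, p

def dv_dt(t, dt=1e-5):
    """Time derivative of v along the trajectory, by central differences."""
    return (v(*trajectory(t + dt)) - v(*trajectory(t - dt))) / (2 * dt)

t = 0.7
lhs = dv_dt(t)                                          # left-hand side of (15)
rhs = poisson_bracket(v, hamiltonian, *trajectory(t))   # right-hand side of (15)
print(abs(lhs - rhs) < 1e-6)
```

For this choice of $v$ the bracket can also be worked out by hand as $p^2/m - kq^2$, which gives an independent check on the numerical value.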

In classical mechanics a dynamical system is defined mathemati- 
cally when the Hamiltonian is given, i.e. when the energy is given 
in terms of a set of canonical coordinates and momenta, as this is 
sufficient to fix the equations of motion. In quantum mechanics a 
dynamical system is defined mathematically when the energy is 
given in terms of dynamical variables whose commutation relations 
are known, as this is then sufficient to fix the equations of motion, 
in both Schrödinger’s and Heisenberg’s form. We need to have
either $H$ expressed in terms of the Schrödinger dynamical variables
or $H_t$ expressed in terms of the corresponding Heisenberg dynamical
variables, the functional relationship being, of course, the same in
both cases. We call the energy expressed in this way the Hamiltonian
of the dynamical system in quantum mechanics, to keep up the
analogy with the classical theory.

A system in quantum mechanics always has a Hamiltonian, whether 
the system is one that has a classical analogue and is describable in 
terms of canonical coordinates and momenta or not. However, if the 
system does have a classical analogue, its connexion with classical 
mechanics is specially close and one can usually assume that the 
Hamiltonian is the same function of the canonical coordinates and 
momenta in the quantum theory as in the classical theory.† There
would be a difficulty in this, of course, if the classical Hamiltonian
involved a product of factors whose quantum analogues do not commute,
as one would not know in which order to put these factors in
the quantum Hamiltonian, but this does not happen for most of the
elementary dynamical systems whose study is important for atomic
physics. In consequence we are able also largely to use the same
language for describing dynamical systems in the quantum theory as
in the classical theory (e.g. to talk about particles with given masses
moving through given fields of force), and when given a system in
classical mechanics, can usually give a meaning to ‘the same’ system
in quantum mechanics.

Equation (13) holds for $v_t$ any function of the Heisenberg dynamical
variables not involving the time explicitly, i.e. for $v$ any constant
linear operator in the Schrödinger picture. It shows that such a
function $v_t$ is constant if it commutes with $H_t$ or if $v$ commutes with $H$.
We then have

$$v_t = v_{t_0} = v,$$

and we call $v_t$ or $v$ a constant of the motion. It is necessary that $v$ shall
commute with $H$ at all times, which is usually possible only if $H$ is
constant. In this case we can substitute $H$ for $v$ in (13) and deduce
that $H_t$ is constant, showing that $H$ itself is then a constant of the
motion. Thus if the Hamiltonian is constant in the Schrödinger
picture, it is also constant in the Heisenberg picture.

† This assumption is found in practice to be successful only when applied with the
dynamical coordinates and momenta referring to a Cartesian system of axes and not
to more general curvilinear coordinates.

For an isolated system, a system not acted on by any external 
forces, there are always certain constants of the motion. One of these 
is the total energy or Hamiltonian. Others are provided by the 
displacement theory of § 25. It is evident physically that the total 
energy must remain unchanged if all the dynamical variables are 
displaced in a certain way, so equation (63) of § 25 must hold with
$v_d = v = H$. Thus $D$ commutes with $H$ and is a constant of the
motion. Passing to the case of an infinitesimal displacement, we see
that the displacement operators $d_x$, $d_y$, and $d_z$ are constants of the
motion and hence, from (69) of § 25, the total momentum is a constant
of the motion. Again, the total energy must remain unchanged if all
the dynamical variables are subjected to a certain rotation. This
leads, as will be shown in § 35, to the result that the total angular
momentum is a constant of the motion. The laws of conservation of
energy, momentum, and angular momentum hold for an isolated system
in the Heisenberg picture in quantum mechanics, as they hold in
classical mechanics.

Two forms for the equations of motion of quantum mechanics have 
now been given. Of these, the Schrödinger form is the more useful
one for practical problems, as it provides the simpler equations. The
unknowns in Schrödinger’s wave equation are the numbers which
form the representative of a ket vector, while Heisenberg’s equation
of motion for a dynamical variable, if expressed in terms of a representation,
would involve as unknowns the numbers forming the
representative of the dynamical variable. The latter are far more
numerous and therefore more difficult to evaluate than the Schrödinger
unknowns. Heisenberg’s form for the equations of motion is
of value in providing an immediate analogy with classical mechanics 
and enabling one to see how various features of classical theory, such
as the conservation laws referred to above, are translated into quantum
theory.

29. Stationary states 

We shall here deal with a dynamical system whose energy is constant.
Certain specially simple relations hold for this case. Equation
(6) can be integrated† to give

$$T = e^{-iH(t-t_0)/\hbar},$$

with the help of the initial condition that $T = 1$ for $t = t_0$. This
result substituted into (1) gives

$$|Pt\rangle = e^{-iH(t-t_0)/\hbar}|Pt_0\rangle, \qquad (16)$$

which is the integral of Schrödinger’s equation of motion (5), and
substituted into (10) it gives

$$v_t = e^{iH(t-t_0)/\hbar}\,v\,e^{-iH(t-t_0)/\hbar}, \qquad (17)$$

which is the integral of Heisenberg’s equation of motion (11), $H_t$ being
now equal to $H$. Thus we have solutions of the equations of motion
in a simple form. However, these solutions are not of much practical
value, because of the difficulty involved in evaluating the operator
$e^{-iH(t-t_0)/\hbar}$ unless $H$ is particularly simple, and for practical purposes
one usually has to fall back on Schrödinger’s wave equation.
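A case in which $H$ is particularly simple is a system with only two states. The sketch below checks by finite differences that the solution (17) obeys the equation of motion (11). It rests on assumptions that are not in the text: units with $\hbar = 1$, $t_0 = 0$, the particular $2\times 2$ Hamiltonian $H = a + b\sigma_x$, and $v = \sigma_z$; the helper names are illustrative.

```python
# Two-state check that v_t = e^{iHt/hbar} v e^{-iHt/hbar} obeys i*hbar dv_t/dt = v_t H - H v_t.
# Assumptions (not from the text): hbar = 1, t0 = 0, H = a*I + b*sigma_x, v = sigma_z.
import cmath
import math

HBAR = 1.0
A, B = 0.3, 1.1

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def scale(c, x):
    return [[c * x[i][j] for j in range(2)] for i in range(2)]

def dagger(x):
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_abs(x):
    return max(abs(x[i][j]) for i in range(2) for j in range(2))

H = [[A, B], [B, A]]
V = [[1.0, 0.0], [0.0, -1.0]]

def T(t):
    """T = exp(-iHt/hbar) in closed form, possible because sigma_x squared is the unit matrix."""
    phase = cmath.exp(-1j * A * t / HBAR)
    c, s = math.cos(B * t / HBAR), math.sin(B * t / HBAR)
    return [[phase * c, -1j * phase * s], [-1j * phase * s, phase * c]]

def v_heis(t):
    """Heisenberg variable (17): T^{-1} v T, with T^{-1} the conjugate transpose of T."""
    return matmul(dagger(T(t)), matmul(V, T(t)))

t, dt = 0.8, 1e-5
lhs = scale(1j * HBAR / (2 * dt), sub(v_heis(t + dt), v_heis(t - dt)))
rhs = sub(matmul(v_heis(t), H), matmul(H, v_heis(t)))   # H_t = H since H commutes with T
residual = max_abs(sub(lhs, rhs))
print(residual < 1e-6)
```

The closed form for $T$ is what makes this example tractable; for a general $H$ the exponential would have to be evaluated by diagonalizing $H$, which is the difficulty the text refers to.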

Let us consider a state of motion such that at time $t_0$ it is an eigenstate
of the energy. The ket $|Pt_0\rangle$ corresponding to it at this time
must be an eigenket of $H$. If $H'$ is the eigenvalue to which it belongs,
equation (16) gives

$$|Pt\rangle = e^{-iH'(t-t_0)/\hbar}|Pt_0\rangle,$$

showing that $|Pt\rangle$ differs from $|Pt_0\rangle$ only by a phase factor. Thus
the state always remains an eigenstate of the energy, and further, it
does not vary with the time at all, since the direction of the ket $|Pt\rangle$
does not vary with the time. Such a state is called a stationary state.
The probability for any particular result of an observation on it is
independent of the time when the observation is made. From our
assumption that the energy is an observable, there are sufficient
stationary states for an arbitrary state to be dependent on them.
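The phase-factor behaviour can be illustrated with a two-state system. This is a sketch only; the Hamiltonian $H = a + b\sigma_x$, the units $\hbar = 1$, $t_0 = 0$, and the names used are assumptions made for the illustration.

```python
# Sketch: an energy eigenket only picks up a phase factor under T = exp(-iHt/hbar).
# Assumptions (not from the text): hbar = 1, t0 = 0, H = a*I + b*sigma_x.
import cmath
import math

HBAR, A, B = 1.0, 0.3, 1.1

def evolve(ket, t):
    """Apply T = exp(-iHt/hbar) for H = a*I + b*sigma_x to a two-component ket."""
    phase = cmath.exp(-1j * A * t / HBAR)
    c, s = math.cos(B * t / HBAR), math.sin(B * t / HBAR)
    return [phase * (c * ket[0] - 1j * s * ket[1]),
            phase * (c * ket[1] - 1j * s * ket[0])]

def probabilities(ket):
    """Squared moduli of the components, i.e. the probabilities in the fixed basis."""
    return [abs(amp) ** 2 for amp in ket]

eigenket = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # eigenket of H with eigenvalue H' = a + b
generic = [1.0, 0.0]                               # not an eigenket of H

t = 1.7
expected_phase = cmath.exp(-1j * (A + B) * t / HBAR)
evolved = evolve(eigenket, t)
phase_error = max(abs(evolved[i] - expected_phase * eigenket[i]) for i in range(2))
print(phase_error < 1e-12)
print(probabilities(evolved), probabilities(evolve(generic, t)))
```

The probabilities for the eigenket stay at their initial values, while those for the ket that is not an eigenket change with $t$.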

The time-dependent wave function $\psi(\xi t)$ representing a stationary
state of energy $H'$ will vary with time according to the law

$$\psi(\xi t) = \psi_0(\xi)\,e^{-iH't/\hbar}, \qquad (18)$$

and Schrödinger’s wave equation (7) for it reduces to

$$H'\psi_0\rangle = H\psi_0\rangle. \qquad (19)$$

This equation merely asserts that the state represented by $\psi_0$ is an
eigenstate of $H$. We call a function $\psi_0$ satisfying (19) an eigenfunction
of $H$, belonging to the eigenvalue $H'$.

† The integration can be carried out as though $H$ were an ordinary algebraic
variable instead of a linear operator, because there is no quantity that does not
commute with $H$ in the work.

In the Heisenberg picture the stationary states correspond to fixed 
eigenvectors of the energy. We can set up a representation in which 
all the basic vectors are eigenvectors of the energy and so correspond 
to stationary states in the Heisenberg picture. We call such a representation
a Heisenberg representation. The first form of quantum
mechanics, discovered by Heisenberg in 1925, was in terms of a
representation of this kind. The energy is diagonal in the representation.
Any other diagonal dynamical variable must commute with the
energy and is therefore a constant of the motion. The problem of
setting up a Heisenberg representation thus reduces to the problem
of finding a complete set of commuting observables, each of which
is a constant of the motion, and then making these observables 
diagonal. The energy must be a function of these observables, from 
Theorem 2 of § 19. It is sometimes convenient to take the energy 
itself as one of them. 

Let $\alpha$ denote the complete set of commuting observables in a
Heisenberg representation, so that the basic vectors are written $\langle\alpha'|$,
$|\alpha'\rangle$. The energy is a function of these observables $\alpha$, say $H = H(\alpha)$.
From (17) we get

$$\langle\alpha'|v_t|\alpha''\rangle = \langle\alpha'|e^{iH(t-t_0)/\hbar}\,v\,e^{-iH(t-t_0)/\hbar}|\alpha''\rangle = e^{i(H'-H'')(t-t_0)/\hbar}\langle\alpha'|v|\alpha''\rangle, \qquad (20)$$

where $H' = H(\alpha')$ and $H'' = H(\alpha'')$. The factor $\langle\alpha'|v|\alpha''\rangle$ on the right-hand
side here is independent of $t$, being an element of the matrix
representing the fixed linear operator $v$. Formula (20) shows how the
Heisenberg matrix elements of any Heisenberg dynamical variable
vary with time, and it makes $v_t$ satisfy the equation of motion (11),
as is easily verified. The variation given by (20) is simply periodic
with the frequency

$$|H'-H''|/2\pi\hbar = |H'-H''|/h, \qquad (21)$$

depending only on the energy difference of the two stationary states
to which the matrix element refers. This result is closely connected
with the Combination Law of Spectroscopy and Bohr’s Frequency
Condition, according to which (21) is the frequency of the electromagnetic
radiation emitted or absorbed when the system makes a
transition under the influence of radiation between the stationary
states $\alpha'$ and $\alpha''$, the eigenvalues of $H$ being Bohr’s energy levels.
These matters will be dealt with in § 45.
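The periodicity stated by (20) and (21) can be seen in a few lines. The sketch assumes units with $\hbar = 1$ and uses made-up values for the two energy levels and for the fixed matrix element; none of these numbers come from the text.

```python
# Sketch: time variation (20) of a Heisenberg matrix element and its frequency (21).
# Assumptions (not from the text): hbar = 1 and the hypothetical values below.
import cmath
import math

HBAR = 1.0
H_LEVELS = [0.5, 2.0]    # hypothetical energy levels H' and H''
V12 = 0.7 + 0.2j         # hypothetical fixed matrix element <a'|v|a''>

def heisenberg_element(t, t0=0.0):
    """Right-hand side of (20)."""
    h1, h2 = H_LEVELS
    return cmath.exp(1j * (h1 - h2) * (t - t0) / HBAR) * V12

frequency = abs(H_LEVELS[0] - H_LEVELS[1]) / (2 * math.pi * HBAR)   # (21)
period = 1.0 / frequency

t = 0.4
print(abs(heisenberg_element(t + period) - heisenberg_element(t)) < 1e-12)
print(abs(abs(heisenberg_element(t)) - abs(V12)) < 1e-12)
```

Only the phase of the matrix element changes with time; its modulus stays equal to that of the fixed element.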

30. The free particle 

The most fundamental and elementary application of quantum 
mechanics is to the system consisting merely of a free particle, or 
particle not acted on by any forces. For dealing with it we use as 
dynamical variables the three Cartesian coordinates $x$, $y$, $z$ and their
conjugate momenta $p_x$, $p_y$, $p_z$. The Hamiltonian is equal to the
kinetic energy of the particle, namely

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) \qquad (22)$$

according to Newtonian mechanics, $m$ being the mass. This formula
is valid only if the velocity of the particle is small compared with $c$,
the velocity of light. For a rapidly moving particle, such as we often
have to deal with in atomic theory, (22) must be replaced by the
relativistic formula

$$H = c(m^2c^2 + p_x^2 + p_y^2 + p_z^2)^{1/2}. \qquad (23)$$

For small values of $p_x$, $p_y$, and $p_z$ (23) goes over into (22), except for
the constant term $mc^2$ which corresponds to the rest-energy of the
particle in the theory of relativity and which has no influence on the
equations of motion. Formulas (22) and (23) can be taken over
directly into the quantum theory, the square root in (23) being now
understood as the positive square root defined at the end of § 11.
The constant term $mc^2$ by which (23) differs from (22) for small values
of $p_x$, $p_y$, and $p_z$ can still have no physical effects, since the Hamiltonian
in the quantum theory, as introduced in § 27, is undefined to
the extent of an arbitrary additive real constant.

We shall here work with the more accurate formula (23). We shall
first solve the Heisenberg equations of motion. From the quantum
conditions (9) of § 21, $p_x$ commutes with $p_y$ and $p_z$, and hence, from
Theorem 1 of § 19 extended to a set of commuting observables, $p_x$
commutes with any function of $p_x$, $p_y$, and $p_z$ and therefore with $H$.
It follows that $p_x$ is a constant of the motion. Similarly $p_y$ and $p_z$ are
constants of the motion. These results are the same as in the classical
theory. Again, the equation of motion for a coordinate, $x_t$ say, is,
according to (11),

$$i\hbar\,\dot{x}_t = x_t\,c(m^2c^2+p_x^2+p_y^2+p_z^2)^{1/2} - c(m^2c^2+p_x^2+p_y^2+p_z^2)^{1/2}\,x_t.$$

The right-hand side here can be evaluated by means of formula
(31) of § 22 with the roles of coordinates and momenta interchanged,
so that it reads

$$q_rf - fq_r = i\hbar\,\partial f/\partial p_r, \qquad (24)$$

$f$ now being any function of the $p$’s. This gives

$$\dot{x}_t = \frac{\partial}{\partial p_x}c(m^2c^2+p_x^2+p_y^2+p_z^2)^{1/2} = \frac{c^2p_x}{H}. \qquad (25)$$

Similarly,

$$\dot{y}_t = \frac{c^2p_y}{H}, \qquad \dot{z}_t = \frac{c^2p_z}{H}.$$

The magnitude of the velocity is

$$v = (\dot{x}_t^2+\dot{y}_t^2+\dot{z}_t^2)^{1/2} = c^2(p_x^2+p_y^2+p_z^2)^{1/2}/H. \qquad (26)$$

Equations (25) and (26) are just the same as in the classical theory.

Let us consider a state that is an eigenstate of the momenta,
belonging to the eigenvalues $p_x'$, $p_y'$, $p_z'$. This state must be an eigenstate
of the Hamiltonian, belonging to the eigenvalue

$$H' = c(m^2c^2 + p_x'^2 + p_y'^2 + p_z'^2)^{1/2}, \qquad (27)$$

and must therefore be a stationary state. The possible values for $H'$
are all numbers from $mc^2$ to $\infty$, as in the classical theory. The wave
function $\psi(xyz)$ representing this state at any time in Schrödinger’s
representation must satisfy

$$p_x'\,\psi(xyz)\rangle = p_x\,\psi(xyz)\rangle = -i\hbar\,\frac{\partial\psi(xyz)}{\partial x}\rangle,$$

with similar equations for $p_y$ and $p_z$. These equations show that
$\psi(xyz)$ is of the form

$$\psi(xyz) = a\,e^{i(p_x'x + p_y'y + p_z'z)/\hbar}, \qquad (28)$$

where $a$ is independent of $x$, $y$, and $z$. From (18) we see now that the
time-dependent wave function $\psi(xyzt)$ is of the form

$$\psi(xyzt) = a_0\,e^{i(p_x'x + p_y'y + p_z'z - H't)/\hbar}, \qquad (29)$$

where $a_0$ is independent of $x$, $y$, $z$, and $t$.

The function (29) of $x$, $y$, $z$, and $t$ describes plane waves in space-time.
We see from this example the suitability of the terms ‘wave
function’ and ‘wave equation’. The frequency of the waves is

$$\nu = H'/h, \qquad (30)$$
their wavelength is

$$\lambda = h/(p_x'^2 + p_y'^2 + p_z'^2)^{1/2} = h/P', \qquad (31)$$

$P'$ being the length of the vector $(p_x', p_y', p_z')$, and their motion is in
the direction specified by the vector $(p_x', p_y', p_z')$ with the velocity

$$\lambda\nu = H'/P' = c^2/v', \qquad (32)$$

$v'$ being the velocity of the particle corresponding to the momentum
$(p_x', p_y', p_z')$ as given by formula (26). Equations (30), (31), and (32)
are easily seen to hold in all Lorentz frames of reference, the expression
on the right-hand side of (29) being, in fact, relativistically
invariant with $p_x'$, $p_y'$, $p_z'$ and $H'$ as the components of a 4-vector.
These properties of relativistic invariance led de Broglie, before the
discovery of quantum mechanics, to postulate the existence of waves
of the form (29) associated with the motion of any particle. They
are therefore known as de Broglie waves.
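To get a feeling for the magnitudes in (30), (31), and (32), one may evaluate them for an electron. This is a sketch only: the SI values of the constants and the sample momentum are assumptions supplied for the illustration, and the function names are hypothetical.

```python
# Sketch: frequency (30), wavelength (31) and wave velocity (32) for a free electron.
# Assumptions (not from the text): the SI constants below and the sample momentum P_PRIME.
import math

H_PLANCK = 6.62607015e-34      # J s
C = 2.99792458e8               # m / s
M_ELECTRON = 9.1093837015e-31  # kg
P_PRIME = 1.0e-24              # kg m / s, hypothetical momentum magnitude P'

def energy(p):
    """Eigenvalue (27): H' = c (m^2 c^2 + P'^2)^(1/2)."""
    return C * math.sqrt(M_ELECTRON ** 2 * C ** 2 + p ** 2)

def de_broglie(p):
    """Return frequency, wavelength, wave velocity and particle velocity for momentum p."""
    h_prime = energy(p)
    nu = h_prime / H_PLANCK                    # (30)
    lam = H_PLANCK / p                         # (31)
    wave_velocity = lam * nu                   # (32), equal to H'/P'
    particle_velocity = C ** 2 * p / h_prime   # (26)
    return nu, lam, wave_velocity, particle_velocity

nu, lam, wave_velocity, particle_velocity = de_broglie(P_PRIME)
print(lam, wave_velocity * particle_velocity / C ** 2)
```

The product of the wave velocity and the particle velocity comes out as $c^2$, so the wave velocity exceeds $c$ whenever the particle moves slower than light.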

In the limiting case when the mass m is made to tend to zero, the 
classical velocity of the particle v becomes equal to c and hence, from 
(32), the wave velocity also becomes c. The waves are then like the 
light-waves associated with a photon, with the difference that they 
contain no reference to the polarization and involve a complex ex- 
ponential instead of sines and cosines. Formulas (30) and (31) are 
still valid, connecting the frequency of the light-waves with the 
energy of the photon and the wavelength of the light-waves with 
the momentum of the photon. 

For the state represented by (29), the probability of the particle
being found in any specified small volume when an observation of its
position is made is independent of where the volume is. This provides
an example of Heisenberg’s principle of uncertainty, the state being
one for which the momentum is accurately given and for which, in
consequence, the position is completely unknown. Such a state is,
of course, a limiting case which never occurs in practice. The states
usually met with in practice are those represented by wave packets,
which may be formed by superposing a number of waves of the type
(29) belonging to slightly different values of $(p_x', p_y', p_z')$, as discussed
in § 24. The ordinary formula in hydrodynamics for the velocity of
such a wave packet, i.e. the group velocity of the waves, is

$$\frac{d\nu}{d(1/\lambda)}, \qquad (33)$$

which gives, from (30) and (31),

$$\frac{dH'}{dP'} = c\,\frac{d}{dP'}(m^2c^2 + P'^2)^{1/2} = \frac{c^2P'}{H'} = v'. \qquad (34)$$

This is just the velocity of the particle. The wave packet moves in
the same direction and with the same velocity as the particle moves
in classical mechanics.
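Relation (34) lends itself to a quick numerical check: differentiate $H'$ with respect to $P'$ by finite differences and compare with $c^2P'/H'$. The sketch assumes units in which $c = m = 1$; the sample momenta and function names are illustrative only.

```python
# Sketch: group velocity dH'/dP' compared with the particle velocity c^2 P'/H', as in (34).
# Assumptions (not from the text): units with c = m = 1 and the sample momenta below.
import math

C, M = 1.0, 1.0

def energy(p):
    """H' = c (m^2 c^2 + P'^2)^(1/2)."""
    return C * math.sqrt(M * M * C * C + p * p)

def group_velocity(p, dp=1e-5):
    """Central-difference estimate of dH'/dP'."""
    return (energy(p + dp) - energy(p - dp)) / (2 * dp)

def particle_velocity(p):
    """Right-hand side of (34): c^2 P'/H'."""
    return C * C * p / energy(p)

momenta = [0.01, 0.5, 1.0, 5.0]
errors = [abs(group_velocity(p) - particle_velocity(p)) for p in momenta]
print(max(errors) < 1e-7)
```

The agreement holds both for small momenta, where the velocity is close to $P'/m$, and for large momenta, where it approaches $c$.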

31. The motion of wave packets 
The result just deduced for a free particle is an example of a general 
principle. For any dynamical system with a classical analogue, a state 
for which the classical description is valid as an approximation is 
represented in quantum mechanics by a wave packet, all the co- 
ordinates and momenta having approximate numerical values, whose 
accuracy is limited by Heisenberg’s principle of uncertainty. Now 
Schrödinger’s wave equation fixes how such a wave packet varies with
time, so in order that the classical description may remain valid, the
wave packet should remain a wave packet and should move according
to the laws of classical dynamics. We shall verify that this is so.

We take a dynamical system having a classical analogue and let
its Hamiltonian be $H(q_r, p_r)$ $(r = 1, 2, \ldots, n)$. The corresponding classical
dynamical system will have as Hamiltonian $H_c(q_r, p_r)$ say, obtained
by putting ordinary algebraic variables for the $q_r$ and $p_r$ in $H(q_r, p_r)$
and making $\hbar \to 0$ if it occurs in $H(q_r, p_r)$. The classical Hamiltonian
$H_c$ is, of course, a real function of its variables. It is usually a
quadratic function of the momenta $p_r$, but not always so, the
relativistic theory of a free particle being an example where it is not.
The following argument is valid for $H_c$ any algebraic function of the $p$’s.

We suppose that the time-dependent wave function in Schrödinger’s
representation is of the form

$$\psi(qt) = A\,e^{iS/\hbar}, \qquad (35)$$

where $A$ and $S$ are real functions of the $q$’s and $t$ which do not vary
very rapidly with their arguments. The wave function is then of the
form of waves, with $A$ and $S$ determining the amplitude and phase
respectively. Schrödinger’s wave equation (7) gives

$$i\hbar\frac{\partial}{\partial t}A\,e^{iS/\hbar}\rangle = H(q_r, p_r)\,A\,e^{iS/\hbar}\rangle$$

or

$$\left\{i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\right\}\rangle = e^{-iS/\hbar}H(q_r, p_r)\,A\,e^{iS/\hbar}\rangle. \qquad (36)$$

Now $e^{-iS/\hbar}$ is evidently a unitary linear operator and may be used for
$U$ in equation (70) of § 26 to give us a unitary transformation. The
$q$’s remain unchanged by this transformation, each $p_r$ goes over into

$$e^{-iS/\hbar}\,p_r\,e^{iS/\hbar} = p_r + \partial S/\partial q_r,$$

with the help of (31) of § 22, and $H$ goes over into

$$e^{-iS/\hbar}H(q_r, p_r)\,e^{iS/\hbar} = H(q_r, p_r + \partial S/\partial q_r),$$

since algebraic relations are preserved by the transformation. Thus
(36) becomes

$$\left\{i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\right\}\rangle = H(q_r, p_r + \partial S/\partial q_r)\,A\rangle. \qquad (37)$$

Let us now suppose that $\hbar$ can be counted as small and let us neglect
terms involving $\hbar$ in (37). This involves neglecting the $p_r$’s that occur
in $H$ in (37), since each $p_r$ is equivalent to the operator $-i\hbar\,\partial/\partial q_r$
operating on the functions of the $q$’s to the right of it. The surviving
terms give

$$-\frac{\partial S}{\partial t} = H_c\left(q_r, \frac{\partial S}{\partial q_r}\right). \qquad (38)$$

This is a differential equation which the phase function $S$ has to
satisfy. The equation is determined by the classical Hamiltonian
function $H_c$ and is known as the Hamilton-Jacobi equation in classical
dynamics. It allows $S$ to be real and so shows that the assumption
of the wave form (35) does not lead to an inconsistency.
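As an illustration of (38), one can test a trial phase function against the Hamilton-Jacobi equation by finite differences. The sketch is built on assumptions that are not in the text: a single free non-relativistic particle with $H_c = p^2/2m$, unit mass, and the trial function $S = m(q-q_0)^2/2t$; the names are hypothetical.

```python
# Sketch: finite-difference check of the Hamilton-Jacobi equation (38).
# Assumptions (not from the text): one free particle with H_c = p^2/(2m), m = 1,
# and the trial phase function S(q, t) = m (q - q0)^2 / (2 t).
M, Q0 = 1.0, 0.2

def phase_function(q, t):
    return M * (q - Q0) ** 2 / (2 * t)

def classical_hamiltonian(q, p):
    return p * p / (2 * M)

def d_dq(f, q, t, h=1e-5):
    """Central-difference partial derivative with respect to q."""
    return (f(q + h, t) - f(q - h, t)) / (2 * h)

def d_dt(f, q, t, h=1e-5):
    """Central-difference partial derivative with respect to t."""
    return (f(q, t + h) - f(q, t - h)) / (2 * h)

def hj_residual(q, t):
    """-dS/dt - H_c(q, dS/dq), which (38) requires to vanish."""
    momentum = d_dq(phase_function, q, t)
    return -d_dt(phase_function, q, t) - classical_hamiltonian(q, momentum)

residuals = [abs(hj_residual(q, t)) for q in (0.5, 1.3, -2.0) for t in (0.9, 2.5)]
print(max(residuals) < 1e-7)
```

Here $\partial S/\partial q = m(q-q_0)/t$ is the momentum of a particle that travels from $q_0$ to $q$ in time $t$, so the trial function is real, as the argument requires.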

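To obtain an equation for $A$, we must retain the terms in (37)
which are linear in $\hbar$ and see what they give. A direct evaluation of
these terms is rather awkward in the case of a general function $H$,
and we can get the result we require more easily by first multiplying
both sides of (37) by the bra vector $\langle Af$, where $f$ is an arbitrary real
function of the $q$’s. This gives

$$\langle Af\left\{i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\right\}\rangle = \langle Af\,H(q_r, p_r+\partial S/\partial q_r)\,A\rangle.$$

The conjugate complex equation is

$$\langle Af\left\{-i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\right\}\rangle = \langle A\,H(q_r, p_r+\partial S/\partial q_r)\,fA\rangle.$$

Subtracting and dividing out by $i\hbar$, we obtain

$$2\langle Af\frac{\partial A}{\partial t}\rangle = \langle A\,[f, H(q_r, p_r+\partial S/\partial q_r)]\,A\rangle. \qquad (39)$$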
We now have to evaluate the P.B.

$$[f, H(q_r, p_r+\partial S/\partial q_r)].$$

Our assumption that $\hbar$ can be counted as small enables us to expand
$H(q_r, p_r+\partial S/\partial q_r)$ as a power series in the $p$’s. The terms of zero degree
will contribute nothing to the P.B. The terms of the first degree in
the $p$’s give a contribution to the P.B. which can be evaluated most
easily with the help of the classical formula (1) of § 21 (this formula
being valid also in the quantum theory if $u$ is independent of the $p$’s
and $v$ is linear in the $p$’s). The amount of this contribution is

$$\sum_s\frac{\partial f}{\partial q_s}\left[\frac{\partial H(q_r, p_r)}{\partial p_s}\right]_{p_r=\partial S/\partial q_r},$$

the notation meaning that we must substitute $\partial S/\partial q_r$ for each $p_r$ in
the function [ ] of the $q$’s and $p$’s, so as to obtain a function of the $q$’s
only. The terms of higher degree in the $p$’s give contributions to the
P.B. which vanish when $\hbar \to 0$. Thus (39) becomes, with neglect of
terms involving $\hbar$, which is equivalent to the neglect of $\hbar^2$ in (37),

$$\langle f\frac{\partial A^2}{\partial t}\rangle = \langle A^2\sum_s\frac{\partial f}{\partial q_s}\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r=\partial S/\partial q_r}\rangle. \qquad (40)$$

Now if $a(q)$ and $b(q)$ are any two functions of the $q$’s, formula
(64) of § 20 gives

$$\langle a(q)\,b(q)\rangle = \int a(q')\,dq'\,b(q'),$$

and so

$$\langle a(q)\frac{\partial b(q)}{\partial q_r}\rangle = -\langle\frac{\partial a(q)}{\partial q_r}\,b(q)\rangle, \qquad (41)$$

provided $a(q)$ and $b(q)$ satisfy suitable boundary conditions, as discussed
in §§ 22 and 23. Hence (40) may be written

$$\langle f\frac{\partial A^2}{\partial t}\rangle = -\langle f\sum_s\frac{\partial}{\partial q_s}\left\{A^2\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r=\partial S/\partial q_r}\right\}\rangle.$$

Since this holds for an arbitrary real function $f$, we must have

$$\frac{\partial A^2}{\partial t} = -\sum_s\frac{\partial}{\partial q_s}\left\{A^2\left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r=\partial S/\partial q_r}\right\}. \qquad (42)$$

This is the equation for the amplitude $A$ of the wave function. To
get an understanding of its significance, let us suppose we have a fluid
moving in the space of the variables $q$, the density of the fluid at any
point and time being $A^2$ and its velocity being

$$\frac{dq_s}{dt} = \left[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\right]_{p_r=\partial S/\partial q_r}. \qquad (43)$$

Equation (42) is then just the equation of conservation for such a
fluid. The motion of the fluid is determined by the function $S$
satisfying (38), there being one possible motion for each solution
of (38).
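A worked special case may make the fluid picture concrete. It is offered only as an illustration under an assumption that is not in the text, namely a single particle of mass $m$ in Cartesian coordinates $q_s$ with the classical Hamiltonian $H_c = \sum_s p_s^2/2m + V(q)$, where $V$ is a hypothetical potential energy. Then (43) and (42) reduce to the familiar velocity field and continuity equation:

```latex
% Assumed Hamiltonian (illustrative, not from the text):
%   H_c = \sum_s p_s^2 / 2m + V(q)
\left[\frac{\partial H_c}{\partial p_s}\right]_{p_r=\partial S/\partial q_r}
    = \frac{1}{m}\frac{\partial S}{\partial q_s},
\qquad
\frac{\partial A^2}{\partial t}
    + \sum_s \frac{\partial}{\partial q_s}
      \left\{ A^2\,\frac{1}{m}\frac{\partial S}{\partial q_s} \right\} = 0 .
```

In this case the fluid has density $A^2$ and velocity equal to the gradient of $S$ divided by $m$, and $V$ enters only through the equation (38) that determines $S$.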

For a given $S$, let us take a solution of (42) for which at some
definite time the density $A^2$ vanishes everywhere outside a certain
small region. We may suppose this region to move with the fluid,
its velocity at each point being given by (43), and then the equation
of conservation (42) will require the density always to vanish outside
the region. There is a limit to how small the region may be, imposed
by the approximation we made in neglecting $\hbar$ in (39). This approximation
is valid only provided

$$\hbar\frac{\partial A}{\partial q_r} \ll \frac{\partial S}{\partial q_r}A,$$

or

$$\frac{1}{A}\frac{\partial A}{\partial q_r} \ll \frac{1}{\hbar}\frac{\partial S}{\partial q_r},$$

which requires that $A$ shall vary by an appreciable fraction of itself
only through a range of the $q$’s in which $S$ varies by many times $\hbar$,
i.e. a range consisting of many wavelengths of the wave function (35).
Our solution is then a wave packet of the type discussed in § 24 and
remains so for all time.

We thus get a wave function representing a state of motion for
which the coordinates and momenta have approximate numerical
values throughout all time. Such a state of motion in quantum
theory corresponds to the states with which classical theory deals.
The motion of our wave packet is determined by equations (38) and
(43). From these we get, defining $p_s$ as $\partial S/\partial q_s$,

$$\frac{dp_s}{dt} = \frac{d}{dt}\frac{\partial S}{\partial q_s} = \frac{\partial^2S}{\partial t\,\partial q_s} + \sum_u\frac{\partial^2S}{\partial q_u\,\partial q_s}\frac{dq_u}{dt}$$

$$= -\frac{\partial}{\partial q_s}H_c\left(q_r, \frac{\partial S}{\partial q_r}\right) + \sum_u\frac{\partial^2S}{\partial q_s\,\partial q_u}\frac{\partial H_c(q_r, p_r)}{\partial p_u}$$

$$= -\frac{\partial H_c(q_r, p_r)}{\partial q_s}, \qquad (44)$$

where in the last line the $p$’s are counted as independent of the $q$’s
before the partial differentiation. Equations (43) and (44) are just
the classical equations of motion in Hamiltonian form and show that
the wave packet moves according to the laws of classical mechanics.
We see in this way how the classical equations of motion are derivable
from the quantum theory as a limiting case.

By a more accurate solution of the wave equation one can show
that the accuracy with which the coordinates and momenta simultaneously
have numerical values cannot remain permanently as
favourable as the limit allowed by Heisenberg’s principle of uncertainty,
equation (66) of § 24, but if it is initially so it will become
less favourable, the wave packet undergoing a spreading.†
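The size of the spreading can be illustrated with the standard textbook result for a free Gaussian packet, which is not derived in this section and is quoted here as an assumption: an initial spread $\sigma_0$ with minimum uncertainty grows as $\sigma_0\{1+(\hbar t/2m\sigma_0^2)^2\}^{1/2}$, while the momentum spread stays at $\hbar/2\sigma_0$. The constants, the initial spread, and the names below are likewise assumptions made for the sketch.

```python
# Sketch: spreading of a free Gaussian wave packet (standard result, not derived in the text).
# Assumptions: non-relativistic free electron, minimum-uncertainty packet of initial
# spread SIGMA_0, and the SI constants below.
import math

HBAR = 1.054571817e-34         # J s
M_ELECTRON = 9.1093837015e-31  # kg
SIGMA_0 = 1.0e-10              # m, hypothetical initial position spread

def position_spread(t):
    """Spread of position at time t: sigma_0 * sqrt(1 + (hbar t / (2 m sigma_0^2))^2)."""
    return SIGMA_0 * math.sqrt(1.0 + (HBAR * t / (2.0 * M_ELECTRON * SIGMA_0 ** 2)) ** 2)

def momentum_spread():
    """Spread of momentum, constant in time for a free particle."""
    return HBAR / (2.0 * SIGMA_0)

def uncertainty_product(t):
    return position_spread(t) * momentum_spread()

for t in (0.0, 1.0e-16, 1.0e-15):
    # Ratio of the uncertainty product to its smallest allowed value hbar/2.
    print(t, uncertainty_product(t) / (HBAR / 2.0))
```

The ratio starts at unity and increases with time, which is the sense in which the accuracy becomes less favourable.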

32. The action principle‡

Equation (10) shows that the Heisenberg dynamical variables at
time $t$, $v_t$, are connected with their values at time $t_0$, $v_{t_0}$ or $v$, by a
unitary transformation. The Heisenberg variables at time $t+\delta t$ are
connected with their values at time $t$ by an infinitesimal unitary
transformation, as is shown by the equation of motion (11) or (13),
which gives the connexion between $v_{t+\delta t}$ and $v_t$ of the form of (79) or
(80) of § 26 with $H_t$ for $F$ and $\delta t/\hbar$ for $\epsilon$. The variation with time of
the Heisenberg dynamical variables may thus be looked upon as the
continuous unfolding of a unitary transformation. In classical
mechanics the dynamical variables at time $t+\delta t$ are connected with
their values at time $t$ by an infinitesimal contact transformation and
the whole motion may be looked upon as the continuous unfolding of a
contact transformation. We have here the mathematical foundation
of the analogy between the classical and quantum equations of
motion, and can develop it to bring out the quantum analogue of all
the main features of the classical theory of dynamics.

Suppose we have a representation in which the complete set of
commuting observables $\xi$ are diagonal, so that a basic bra is $\langle\xi'|$.
We can introduce a second representation in which the basic bras are

$$\langle\xi'^*| = \langle\xi'|T. \qquad (45)$$

The new basic bras depend on the time $t$ and give us a moving
representation, like a moving system of axes in an ordinary vector
space. Comparing (45) with the conjugate imaginary of (8), we see
that the new basic vectors are just the transforms in the Heisenberg
picture of the original basic vectors in the Schrödinger picture, and
hence they must be connected with the Heisenberg dynamical
variables $v_t$ in the same way in which the original basic vectors are
connected with the Schrödinger dynamical variables $v$. In particular,
each $\langle\xi'^*|$ must be an eigenvector of the $\xi_t$’s belonging to the eigenvalues
$\xi'$. It may therefore be written $\langle\xi_t'|$, with the understanding
that the numbers $\xi_t'$ are the same eigenvalues of the $\xi_t$’s that the $\xi'$’s
are of the $\xi$’s. From (45) we get

$$\langle\xi_t'|\xi''\rangle = \langle\xi'|T|\xi''\rangle, \qquad (46)$$

showing that the transformation function is just the representative
of $T$ in the original representation.

† See Kennard, Z. f. Physik, 44 (1927), 344; Darwin, Proc. Roy. Soc. A, 117 (1927),
258.

‡ This section may be omitted by the student who is not specially concerned with
higher dynamics.

Differentiating (45) with respect to $t$ and using (6), we get

$$i\hbar\frac{d}{dt}\langle\xi_t'| = \langle\xi'|\,i\hbar\frac{dT}{dt} = \langle\xi'|HT = \langle\xi_t'|H_t,$$

with the help of (12). Multiplying on the right by any ket $|a\rangle$
independent of $t$, we get

$$i\hbar\frac{d}{dt}\langle\xi_t'|a\rangle = \langle\xi_t'|H_t|a\rangle = \int\langle\xi_t'|H_t|\xi_t''\rangle\,d\xi_t''\,\langle\xi_t''|a\rangle, \qquad (47)$$

if we take for definiteness the case of continuous eigenvalues for the
$\xi$’s. Now equation (5), written in terms of representatives, reads

$$i\hbar\frac{d}{dt}\langle\xi'|Pt\rangle = \int\langle\xi'|H|\xi''\rangle\,d\xi''\,\langle\xi''|Pt\rangle. \qquad (48)$$

Since $\langle\xi_t'|H_t|\xi_t''\rangle$ is the same function of the variables $\xi_t'$ and $\xi_t''$ that
$\langle\xi'|H|\xi''\rangle$ is of $\xi'$ and $\xi''$, equations (47) and (48) are of precisely the
same form, with the variables $\xi_t'$, $\xi_t''$ in (47) playing the role of the
variables $\xi'$ and $\xi''$ in (48) and the function $\langle\xi_t'|a\rangle$ playing the role
of the function $\langle\xi'|Pt\rangle$. We can thus look upon (47) as a form of
Schrödinger’s wave equation, with the function $\langle\xi_t'|a\rangle$ of the variables
$\xi_t'$ as the wave function. In this way Schrödinger’s wave equation
appears in a new light, as the condition on the representative, in the
moving representation with the Heisenberg variables $\xi_t$ diagonal, of the
fixed ket corresponding to a state in the Heisenberg picture. The function
$\langle\xi_t'|a\rangle$ owes its variation with time to its left factor $\langle\xi_t'|$, in contradistinction
to the function $\langle\xi'|Pt\rangle$, which owes its variation with time
to its right factor $|Pt\rangle$.

If we put $|a\rangle = |\xi''\rangle$ in (47), we get

$$i\hbar\frac{d}{dt}\langle\xi_t'|\xi''\rangle = \int\langle\xi_t'|H_t|\xi_t'''\rangle\,d\xi_t'''\,\langle\xi_t'''|\xi''\rangle, \qquad (49)$$

showing that the transformation function $\langle\xi_t'|\xi''\rangle$ satisfies Schrödinger’s
wave equation. Now $\xi_{t_0} = \xi$, so we must have

$$\langle\xi_{t_0}'|\xi''\rangle = \delta(\xi_{t_0}' - \xi''), \qquad (50)$$

the $\delta$ function here being understood as the product of a number of
factors, one for each $\xi$-variable, such as occurs for the variables
$\xi_{v+1}, \ldots, \xi_u$ on the right-hand side of equation (34) of § 16. Thus the
transformation function $\langle\xi_t'|\xi''\rangle$ is that solution of Schrödinger’s wave
equation for which the $\xi$’s certainly have the values $\xi''$ at time $t_0$.
The square of its modulus, $|\langle\xi_t'|\xi''\rangle|^2$, is the relative probability of the
$\xi$’s having the values $\xi_t'$ at time $t > t_0$ if they certainly have the values
$\xi''$ at time $t_0$. We may write $\langle\xi_t'|\xi''\rangle$ as $\langle\xi_t'|\xi_{t_0}''\rangle$ and consider it as
depending on $t_0$ as well as on $t$. To get its dependence on $t_0$ we take
the conjugate complex of equation (49), interchange $t$ and $t_0$ and also
interchange single primes and double primes. This gives

$$-i\hbar\frac{d}{dt_0}\langle\xi_t'|\xi_{t_0}''\rangle = \int\langle\xi_t'|\xi_{t_0}'''\rangle\,d\xi_{t_0}'''\,\langle\xi_{t_0}'''|H_{t_0}|\xi_{t_0}''\rangle. \qquad (51)$$

The foregoing discussion of the transformation function $\langle\xi'_t|\xi''\rangle$ is
valid with the $\xi$'s any complete set of commuting observables. The
equations were written down for the case of the $\xi$'s having continuous
eigenvalues, but they would still be valid if any of the $\xi$'s have
discrete eigenvalues, provided the necessary formal changes are made
in them. Let us now take a dynamical system having a classical
analogue and let us take the $\xi$'s to be the coordinates $q$. Put

$$\langle q'_t|q''\rangle = e^{iS/\hbar} \tag{52}$$

and so define the function $S$ of the variables $q'_t$, $q''$. This function also
depends explicitly on $t$. (52) is a solution of Schrödinger's wave
equation and, if $\hbar$ can be counted as small, it can be handled in the
same way as (35) was. The $S$ of (52) differs from the $S$ of (35) on
account of there being no $A$ in (52), which makes the $S$ of (52)
complex, but the real part of this $S$ equals the $S$ of (35) and its pure
imaginary part is of the order $\hbar$. Thus, in the limit $\hbar \to 0$, the $S$ of
(52) will equal that of (35) and will therefore satisfy, corresponding
to (38),

$$-\partial S/\partial t = H_c(q'_{rt}, p'_{rt}), \tag{53}$$

where

$$p'_{rt} = \partial S/\partial q'_{rt}, \tag{54}$$

and $H_c$ is the Hamiltonian of the classical analogue of our quantum
dynamical system. But (52) is also a solution of (51) with $q$'s for $\xi$'s,
which is the conjugate complex of Schrödinger's wave equation in the
variables $q''$ or $q''_{t_0}$. This causes $S$ to satisfy also†

$$\partial S/\partial t_0 = H_c(q''_r, p''_r), \tag{55}$$

where

$$p''_r = -\partial S/\partial q''_r. \tag{56}$$

The solution of the Hamilton-Jacobi equations (53), (55) is the
action function of classical mechanics for the time interval $t_0$ to $t$,
i.e. it is the time integral of the Lagrangian $L$,

$$S = \int_{t_0}^{t} L(t')\,dt'. \tag{57}$$

Thus the $S$ defined by (52) is the quantum analogue of the classical action
function and equals it in the limit $\hbar \to 0$. To get the quantum analogue
of the classical Lagrangian, we pass to the case of an infinitesimal
time interval by putting $t = t_0+\delta t$ and we then have $\langle q'_{t_0+\delta t}|q''_{t_0}\rangle$ as the
analogue of $e^{iL(t_0)\delta t/\hbar}$. For the sake of the analogy, one should consider
$L(t_0)$ as a function of the coordinates $q'$ at time $t_0+\delta t$ and the
coordinates $q''$ at time $t_0$, rather than as a function of the coordinates
and velocities at time $t_0$, as one usually does.
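As a numerical illustration of (53) to (56), the sketch below takes the simplest case one can write down, a free particle of mass m, whose classical action is S = m(q − q0)²/2(t − t0). The free particle, the sample point and the helper names (`action`, `partial`, `h_classical`) are assumptions made only for this illustration; they are not taken from the text. The partial derivatives are estimated by central differences.

```python
# Free particle of mass m: an illustrative choice, not taken from the text.
# Classical action between (q0, t0) and (q, t): S = m (q - q0)^2 / (2 (t - t0)).

def action(q, t, q0, t0, m=1.0):
    """Classical action of a free particle for the time interval t0 -> t."""
    return m * (q - q0) ** 2 / (2.0 * (t - t0))

def partial(f, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f w.r.t. args[i]."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

def h_classical(p, m=1.0):
    """Classical Hamiltonian H_c = p^2 / 2m of the free particle."""
    return p * p / (2.0 * m)

point = (1.3, 2.0, 0.4, 0.5)             # (q, t, q0, t0), arbitrary sample values
p_final = partial(action, point, 0)      # (54): momentum at time t is dS/dq
p_initial = -partial(action, point, 2)   # (56): momentum at time t0 is -dS/dq0
lhs_53 = -partial(action, point, 1)      # left-hand side of (53): -dS/dt
lhs_55 = partial(action, point, 3)       # left-hand side of (55): dS/dt0

print(lhs_53, h_classical(p_final))      # the two numbers should agree
print(lhs_55, h_classical(p_initial))    # and so should these
```

Both momenta come out equal, as they must for a free particle, and each Hamilton-Jacobi equation is satisfied to within the accuracy of the finite differences.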

The principle of least action in classical mechanics says that the
action function (57) remains stationary for small variations of the
trajectory of the system which do not alter the end points, i.e. for small
variations of the $q$'s at all intermediate times between $t_0$ and $t$ with $q_{t_0}$
and $q_t$ fixed. Let us see what it corresponds to in the quantum theory.

Put

$$\exp\Big\{i\int_{t_a}^{t_b} L(t)\,dt/\hbar\Big\} = \exp\{iS(t_b,t_a)/\hbar\} = B(t_b,t_a), \tag{58}$$

so that $B(t_b,t_a)$ corresponds to $\langle q'_{t_b}|q'_{t_a}\rangle$ in the quantum theory. (We
here allow $q'_{t_a}$ and $q'_{t_b}$ to denote different eigenvalues of $q_{t_a}$ and $q_{t_b}$, to
save having to introduce a large number of primes into the analysis.)
Now suppose the time interval $t_0 \to t$ to be divided up into a large
number of small time intervals $t_0 \to t_1$, $t_1 \to t_2$, ..., $t_{m-1} \to t_m$, $t_m \to t$ by
the introduction of a sequence of intermediate times $t_1, t_2,..., t_m$. Then

$$B(t,t_0) = B(t,t_m)B(t_m,t_{m-1})\cdots B(t_1,t_0). \tag{59}$$

† For a more accurate comparison of transformation functions with classical
theory, see Van Vleck, Proc. Nat. Acad. 14, 178.

The corresponding quantum equation, which follows from the
property of basic vectors (35) of § 16, is

$$\langle q'_t|q'_0\rangle = \int\!\!\int\!\cdots\!\int \langle q'_t|q'_m\rangle\,dq'_m\,\langle q'_m|q'_{m-1}\rangle\,dq'_{m-1}\cdots\langle q'_2|q'_1\rangle\,dq'_1\,\langle q'_1|q'_0\rangle, \tag{60}$$

$q'_k$ being written for $q'_{t_k}$ for brevity. At first sight there does not seem
to be any close correspondence between (59) and (60). We must,
however, analyse the meaning of (59) rather more carefully. We must
regard each factor $B$ as a function of the $q$'s at the two ends of the
time interval to which it refers. This makes the right-hand side of
(59) a function, not only of $q_t$ and $q_{t_0}$, but also of all the intermediate
$q$'s. Equation (59) is valid only when we substitute for the
intermediate $q$'s in its right-hand side their values for the real trajectory,
small variations in which values leave $S$ stationary and therefore also,
from (58), leave $B(t,t_0)$ stationary. It is the process of substituting
these values for the intermediate $q$'s which corresponds to the
integrations over all values for the intermediate $q'$'s in (60). The quantum
analogue of the action principle is thus absorbed in the composition
law (60) and the classical requirement that the values of the
intermediate $q$'s shall make $S$ stationary corresponds to the condition
in quantum mechanics that all values of the intermediate $q'$'s
are important in proportion to their contribution to the integral
in (60).
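The statement that (59) holds only for the intermediate values belonging to the real trajectory can be checked on a small example. The sketch below again assumes a free particle (an illustrative choice, not part of the text) and a single intermediate time t1. Since B is the exponential of iS/ħ, the product of two factors B corresponds to the sum of two actions; the code scans the intermediate coordinate q1 and confirms that the summed action is stationary (here a minimum) on the straight-line trajectory, where it equals the action for the whole interval.

```python
def action(q_b, t_b, q_a, t_a, m=1.0):
    """Free-particle action for the interval t_a -> t_b (illustrative system)."""
    return m * (q_b - q_a) ** 2 / (2.0 * (t_b - t_a))

def two_step_action(q1, q, t, q0, t0, t1):
    """S(t, t1) + S(t1, t0), regarded as a function of the intermediate q1."""
    return action(q, t, q1, t1) + action(q1, t1, q0, t0)

q0, t0, q, t, t1 = 0.0, 0.0, 2.0, 4.0, 1.0   # arbitrary sample end points

# Scan q1 over a grid and pick the value giving the smallest summed action.
grid = [i * 0.001 for i in range(-2000, 4001)]
q1_best = min(grid, key=lambda q1: two_step_action(q1, q, t, q0, t0, t1))

q1_trajectory = q0 + (q - q0) * (t1 - t0) / (t - t0)   # uniform motion
s_direct = action(q, t, q0, t0)
s_composed = two_step_action(q1_best, q, t, q0, t0, t1)

print(q1_best, q1_trajectory)   # stationary q1 lies on the real trajectory
print(s_composed, s_direct)     # and there the two actions coincide
```

For any other choice of q1 the summed action is larger, so the classical product (59) would fail; in the quantum law (60) those other values are not discarded but integrated over.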

Let us see how (59) can be a limiting case of (60) for $\hbar$ small. We
must suppose the integrand in (60) to be of the form $e^{iF/\hbar}$, where $F$ is
a function of $q'_0, q'_1, q'_2,..., q'_m, q'_t$ which remains continuous as $\hbar$ tends
to zero, so that the integrand is a rapidly oscillating function when
$\hbar$ is small. The integral of such a rapidly oscillating function will be
extremely small, except for the contribution arising from a region in
the domain of integration where comparatively large variations in
the $q'_k$ produce only very small variations in $F$. Such a region must
be the neighbourhood of a point where $F$ is stationary for small
variations of the $q'_k$. Thus the integral in (60) is determined essentially by
the value of the integrand at a point where the integrand is stationary
for small variations of the intermediate $q'$'s, and so (60) goes over
into (59).
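The dominance of the stationary region can be seen numerically. The following sketch uses a hypothetical one-variable phase F(q) = (q − 1)², stationary at q = 1, and the value ħ = 0.01; both are assumptions chosen for illustration. It compares the integral of exp(iF/ħ) over a window containing the stationary point with the integral over a window of the same width placed away from it.

```python
import cmath

def oscillatory_integral(f, a, b, hbar, steps=20000):
    """Midpoint-rule estimate of the integral of exp(i f(q) / hbar) over [a, b]."""
    dq = (b - a) / steps
    total = 0j
    for k in range(steps):
        q = a + (k + 0.5) * dq
        total += cmath.exp(1j * f(q) / hbar)
    return total * dq

def phase(q):
    """Illustrative phase function F, stationary at q = 1."""
    return (q - 1.0) ** 2

hbar = 0.01
near = abs(oscillatory_integral(phase, 0.5, 1.5, hbar))   # contains q = 1
far = abs(oscillatory_integral(phase, 2.5, 3.5, hbar))    # no stationary point

print(near, far)
```

The first modulus is of the order of the square root of πħ, while the second is smaller by more than an order of magnitude and shrinks further as ħ is reduced.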

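Equations (54) and (56) express that the variables $q'_t$, $p'_t$ are
connected with the variables $q''$, $p''$ by a contact transformation and are
one of the standard forms of writing the equations of a contact
transformation. There is an analogous form for writing the equations of a
unitary transformation in quantum mechanics. We get from (52), with
the help of (45) of § 22,

$$\langle q'_t|p_{rt}|q''\rangle = -i\hbar\frac{\partial}{\partial q'_{rt}}\langle q'_t|q''\rangle = \frac{\partial S(q'_t,q'')}{\partial q'_{rt}}\langle q'_t|q''\rangle. \tag{61}$$

Similarly, with the help of (46) of § 22,

$$\langle q'_t|p_r|q''\rangle = i\hbar\frac{\partial}{\partial q''_r}\langle q'_t|q''\rangle = -\frac{\partial S(q'_t,q'')}{\partial q''_r}\langle q'_t|q''\rangle. \tag{62}$$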

From the general definition of functions of commuting observables,
we have

$$\langle q'_t|f(q_t)g(q)|q''\rangle = f(q'_t)g(q'')\langle q'_t|q''\rangle, \tag{63}$$

where $f(q_t)$ and $g(q)$ are functions of the $q_t$'s and $q$'s respectively. Let
$G(q_t,q)$ be any function of the $q_t$'s and $q$'s consisting of a sum or
integral of terms each of the form $f(q_t)g(q)$, so that all the $q_t$'s in $G$
occur to the left of all the $q$'s. Such a function we call well ordered.
Applying (63) to each of the terms in $G$ and adding or integrating,
we get

$$\langle q'_t|G(q_t,q)|q''\rangle = G(q'_t,q'')\langle q'_t|q''\rangle.$$

Now let us suppose each $p_{rt}$ and $p_r$ can be expressed as a well-ordered
function of the $q_t$'s and $q$'s and write these functions $p_{rt}(q_t,q)$, $p_r(q_t,q)$.
Putting these functions for $G$, we get

$$\langle q'_t|p_{rt}|q''\rangle = p_{rt}(q'_t,q'')\langle q'_t|q''\rangle,$$

$$\langle q'_t|p_r|q''\rangle = p_r(q'_t,q'')\langle q'_t|q''\rangle.$$

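Comparing these equations with (61) and (62) respectively, we see
that

$$p_{rt}(q'_t,q'') = \frac{\partial S(q'_t,q'')}{\partial q'_{rt}}, \qquad p_r(q'_t,q'') = -\frac{\partial S(q'_t,q'')}{\partial q''_r}.$$

This means that

$$p_{rt} = \frac{\partial S(q_t,q)}{\partial q_{rt}}, \qquad p_r = -\frac{\partial S(q_t,q)}{\partial q_r}, \tag{64}$$

provided the right-hand sides of (64) are written as well-ordered
functions.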

These equations are of the same form as (54) and (56), but refer to
the non-commuting quantum variables $q_t$, $q$ instead of the ordinary
algebraic variables $q'_t$, $q''$. They show how the conditions for a unitary
transformation between quantum variables are analogous to the
conditions for a contact transformation between classical variables. The
analogy is not complete, however, because the classical $S$ must be real
and there is no simple condition corresponding to this for the $S$ of (64).

33. The Gibbs ensemble 

In our work up to the present we have been assuming all along that
our dynamical system at each instant of time is in a definite state,
that is to say, its motion is specified as completely and accurately as
is possible without conflicting with the general principles of the theory.
In the classical theory this would mean, of course, that all the
coordinates and momenta have specified values. Now we may be interested
in a motion which is specified to a lesser extent than this maximum
possible. The present section will be devoted to the methods to be
used in such a case.

The procedure in classical mechanics is to introduce what is called 
a Gibbs ensemble, the idea of which is as follows. We consider all the 
dynamical coordinates and momenta as Cartesian coordinates in a 
certain space, the phase space, whose number of dimensions is twice
the number of degrees of freedom of the system. Any state of the 
system can then be represented by a point in this space. This point 
will move according to the classical equations of motion (14). Sup- 
pose, now, that we are not given that the system is in a definite state 
at any time, but only that it is in one or other of a number of possible 
states according to a definite probability law. We should then be 
able to represent it by a fluid in the phase space, the mass of fluid in 
any volume of the phase space being the total probability of the 
system being in any state whose representative point lies in that 
volume. Each particle of the fluid will be moving according to the 
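equations of motion (14). If we introduce the density $\rho$ of the fluid
at any point, equal to the probability per unit volume of phase space
of the system being in the neighbourhood of the corresponding state,
we shall have the equation of conservation

$$\frac{\partial\rho}{\partial t} = -\sum_r\Big\{\frac{\partial}{\partial q_r}\Big(\rho\frac{dq_r}{dt}\Big)+\frac{\partial}{\partial p_r}\Big(\rho\frac{dp_r}{dt}\Big)\Big\} = -\sum_r\Big\{\frac{\partial}{\partial q_r}\Big(\rho\frac{\partial H}{\partial p_r}\Big)-\frac{\partial}{\partial p_r}\Big(\rho\frac{\partial H}{\partial q_r}\Big)\Big\} = -[\rho,H]. \tag{65}$$

This may be considered as the equation of motion for the fluid, since
it determines the density $\rho$ for all time if $\rho$ is given initially as a
function of the $q$'s and $p$'s. It is, apart from the minus sign, of the
same form as the ordinary equation of motion (15) for a dynamical
variable.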

The requirement that the total probability of the system being in
any state shall be unity gives us a normalizing condition for $\rho$

$$\int\!\!\int \rho\,dq\,dp = 1, \tag{66}$$

the integration being over the whole of phase space and the single
differential $dq$ or $dp$ being written to denote the product of all the
$dq$'s or $dp$'s. If $\beta$ denotes any function of the dynamical variables,
the average value of $\beta$ will be

$$\int\!\!\int \beta\rho\,dq\,dp. \tag{67}$$

It makes only a trivial alteration in the theory, but often facilitates
discussion, if we work with a density $\rho$ differing from the above one
by a positive constant factor, $k$ say, so that we have instead of (66)

$$\int\!\!\int \rho\,dq\,dp = k.$$

With this density we can picture the fluid as representing a number
$k$ of similar dynamical systems, all following through their motions
independently in the same place, without any mutual disturbance or
interaction. The density at any point would then be the probable or
average number of systems in the neighbourhood of any state per unit
volume of phase space, and expression (67) would give the average
total value of $\beta$ for all the systems. Such a set of dynamical systems,
which is the ensemble introduced by Gibbs, is usually not realizable
in practice, except as a rough approximation, but it forms all the
same a useful theoretical abstraction.

We shall now see that there exists a corresponding density $\rho$
in quantum mechanics, having properties analogous to the above.
It was first introduced by von Neumann. Its existence is rather
surprising in view of the fact that phase space has no meaning in
quantum mechanics, there being no possibility of assigning numerical
values simultaneously to the $q$'s and $p$'s.

We consider a dynamical system which is at a certain time in one
or other of a number of possible states according to some given
probability law. These states may be either a discrete set or a
continuous range, or both together. We shall here take for definiteness
the case of a discrete set and suppose them labelled by a parameter $m$.
Let the normalized ket vectors corresponding to them be $|m\rangle$ and let
the probability of the system being in the $m$th state be $P_m$. We then
define the quantum density $\rho$ by

$$\rho = \sum_m |m\rangle P_m\langle m|. \tag{68}$$

Let $\rho'$ be any eigenvalue of $\rho$ and $|\rho'\rangle$ an eigenket belonging to this
eigenvalue. Then

$$\sum_m |m\rangle P_m\langle m|\rho'\rangle = \rho|\rho'\rangle = \rho'|\rho'\rangle,$$

so that

$$\sum_m \langle\rho'|m\rangle P_m\langle m|\rho'\rangle = \rho'\langle\rho'|\rho'\rangle$$

or

$$\sum_m P_m|\langle m|\rho'\rangle|^2 = \rho'\langle\rho'|\rho'\rangle.$$

Now $P_m$, being a probability, can never be negative. It follows that
$\rho'$ cannot be negative. Thus $\rho$ has no negative eigenvalues, in analogy
with the fact that the classical density $\rho$ is never negative.

Let us now obtain the equation of motion for our quantum $\rho$. In
Schrödinger's picture the kets and bras in (68) will vary with the time
in accordance with Schrödinger's equation (5) and the conjugate
imaginary of this equation, while the $P_m$'s will remain constant, since
the system, so long as it is left undisturbed, cannot change over from
a state corresponding to one ket satisfying Schrödinger's equation to
a state corresponding to another. We thus have

$$i\hbar\frac{d\rho}{dt} = \sum_m i\hbar\Big\{\frac{d|m\rangle}{dt}P_m\langle m| + |m\rangle P_m\frac{d\langle m|}{dt}\Big\} = \sum_m \{H|m\rangle P_m\langle m| - |m\rangle P_m\langle m|H\} = H\rho-\rho H. \tag{69}$$

This is the quantum analogue of the classical equation of motion
(65). Our quantum $\rho$, like the classical one, is determined for all time
if it is given initially.

From the assumption of § 12, the average value of any observable
$\beta$ when the system is in the state $m$ is $\langle m|\beta|m\rangle$. Hence if the system
is distributed over the various states $m$ according to the probability
law $P_m$, the average value of $\beta$ will be $\sum_m P_m\langle m|\beta|m\rangle$. If we introduce
a representation with a discrete set of basic ket vectors $|\xi'\rangle$ say, this
equals

$$\sum_{\xi' m} P_m\langle m|\xi'\rangle\langle\xi'|\beta|m\rangle = \sum_{\xi' m} \langle\xi'|\beta|m\rangle P_m\langle m|\xi'\rangle = \sum_{\xi'}\langle\xi'|\beta\rho|\xi'\rangle = \sum_{\xi'}\langle\xi'|\rho\beta|\xi'\rangle, \tag{70}$$

the last step being easily verified with the law of matrix
multiplication, equation (44) of § 17. The expressions (70) are the analogue of
the expression (67) of the classical theory. Whereas in the classical
theory we have to multiply $\beta$ by $\rho$ and take the integral of the
product over all phase space, in the quantum theory we have to
multiply $\beta$ by $\rho$, with the factors in either order, and take the
diagonal sum of the product in a representation. If the
representation involves a continuous range of basic vectors $|\xi'\rangle$, we get instead
of (70)

$$\int \langle\xi'|\beta\rho|\xi'\rangle\,d\xi' = \int \langle\xi'|\rho\beta|\xi'\rangle\,d\xi', \tag{71}$$

so that we must carry through a process of 'integrating along the
diagonal' instead of summing the diagonal elements. We define
(71) to be the diagonal sum of $\beta\rho$ in the continuous case. It can easily
be verified, from the properties of transformation functions (56) of
§ 18, that the diagonal sum is the same for all representations.
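A small numerical check of (68) and (70) may make the rule concrete. The sketch below assumes a system with only two basic states and uses arbitrary sample values for the kets, the probabilities and the observable; none of these numbers comes from the text. The two kets are deliberately not orthogonal, since (68) does not require them to be.

```python
def outer(ket):
    """The matrix |ket><ket| as a nested list."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    """Diagonal sum of a square matrix."""
    return sum(x[i][i] for i in range(len(x)))

def expectation(ket, beta):
    """<ket|beta|ket> for a normalized ket."""
    n = len(ket)
    return sum(ket[i].conjugate() * beta[i][j] * ket[j]
               for i in range(n) for j in range(n))

s = 2 ** -0.5
kets = [[1 + 0j, 0j], [s + 0j, 1j * s]]        # normalized, not orthogonal
probs = [0.25, 0.75]                            # the P_m
beta = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]    # a Hermitian observable

# (68): rho = sum over m of |m> P_m <m|
rho = [[sum(p * outer(k)[i][j] for p, k in zip(probs, kets))
        for j in range(2)] for i in range(2)]

average_direct = sum(p * expectation(k, beta) for p, k in zip(probs, kets))
average_beta_rho = trace(matmul(beta, rho))     # diagonal sum of beta rho
average_rho_beta = trace(matmul(rho, beta))     # diagonal sum of rho beta

print(average_direct, average_beta_rho, average_rho_beta)
```

All three numbers agree (imaginary parts vanish to rounding), which is the content of (70): the order of the factors does not matter once the diagonal sum is taken.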

From the condition that the $|m\rangle$'s are normalized we get, with
discrete $\xi'$'s,

$$\sum_{\xi'}\langle\xi'|\rho|\xi'\rangle = \sum_{\xi' m}\langle\xi'|m\rangle P_m\langle m|\xi'\rangle = \sum_m P_m = 1, \tag{72}$$

since the total probability of the system being in any state is unity.
This is the analogue of equation (66). The probability of the system
being in the state $\xi'$, or the probability of the observables $\xi$ which
are diagonal in the representation having the values $\xi'$, is, according
to the rule for interpreting representatives of kets (51) of § 18,

$$P_{\xi'} = \sum_m |\langle\xi'|m\rangle|^2 P_m = \langle\xi'|\rho|\xi'\rangle, \tag{73}$$

which gives us a meaning for each term in the sum on the left-hand
side of (72). For continuous $\xi'$'s, the right-hand side of (73) gives the
probability of the $\xi$'s having values in the neighbourhood of $\xi'$ per
unit range of variation of the values $\xi'$.

As in the classical theory, we may take a density equal to $k$ times
the above $\rho$ and consider it as representing a Gibbs ensemble of $k$
similar dynamical systems, between which there is no mutual
disturbance or interaction. We shall then have $k$ on the right-hand side
of (72), and (70) or (71) will give the total average $\beta$ for all the
members of the ensemble, while (73) will give the total probability
of a member of the ensemble having values for its $\xi$'s equal to $\xi'$
or in the neighbourhood of $\xi'$ per unit range of variation of the
values $\xi'$.

An important application of the Gibbs ensemble is to a dynamical
system in thermodynamic equilibrium with its surroundings at a
given temperature $T$. Gibbs showed that such a system is
represented in classical mechanics by the density

$$\rho = ce^{-H/kT}, \tag{74}$$

$H$ being the Hamiltonian, which is now independent of the time, $k$
being Boltzmann's constant, and $c$ being a number chosen to make
the normalizing condition (66) hold. This formula may be taken over
unchanged into the quantum theory. At high temperatures, (74)
becomes $\rho = c$, which gives, on being substituted into the right-hand
side of (73), $c\langle\xi'|\xi'\rangle = c$ in the case of discrete $\xi'$'s. This shows that
at high temperatures all discrete states are equally probable.
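The approach to equal probabilities can be illustrated with a few numbers. The sketch below assumes a system with only four discrete energy levels, with energies and temperatures in arbitrary units (all sample values, not from the text), and works in the representation in which H is diagonal, so that the diagonal elements of (74) are simply the Boltzmann factors.

```python
import math

def gibbs_probabilities(energies, kT):
    """Diagonal elements of rho = c exp(-H/kT) in the representation with H diagonal."""
    weights = [math.exp(-e / kT) for e in energies]
    c = 1.0 / sum(weights)        # c is fixed by the normalizing condition
    return [c * w for w in weights]

levels = [0.5, 1.5, 2.5, 3.5]     # illustrative energies, arbitrary units
cold = gibbs_probabilities(levels, kT=0.5)
hot = gibbs_probabilities(levels, kT=1.0e4)

print(cold)   # the lowest level dominates
print(hot)    # the four probabilities are nearly equal
```

At the low temperature nearly all the probability sits in the lowest level, while at the high temperature each level has probability close to one quarter.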
34. The harmonic oscillator

A simple and interesting example of a dynamical system in quantum
mechanics is the harmonic oscillator. This example is of importance
for general theory, because it forms a corner-stone in the theory of
radiation. The dynamical variables needed for describing the system
are just one coordinate $q$ and its conjugate momentum $p$. The
Hamiltonian in classical mechanics is

$$H = \frac{1}{2m}(p^2+m^2\omega^2q^2), \tag{1}$$

where $m$ is the mass of the oscillating particle and $\omega$ is $2\pi$ times the
frequency. We assume the same Hamiltonian in quantum mechanics.
This Hamiltonian, together with the quantum condition (10) of § 22,
define the system completely.

The Heisenberg equations of motion are

$$\dot q_t = [q_t,H] = p_t/m, \qquad \dot p_t = [p_t,H] = -m\omega^2q_t. \tag{2}$$

It is convenient to introduce the dimensionless complex dynamical
variable

$$\eta = (2m\hbar\omega)^{-1/2}(p+im\omega q). \tag{3}$$

The equations of motion (2) give

$$\dot\eta_t = (2m\hbar\omega)^{-1/2}(-m\omega^2q_t+i\omega p_t) = i\omega\eta_t.$$

This equation can be integrated to give

$$\eta_t = \eta_0e^{i\omega t}, \tag{4}$$

where $\eta_0$ is a linear operator independent of $t$, and is equal to the
value of $\eta_t$ at time $t = 0$. The above equations are all as in the
classical theory.

We can express $q$ and $p$ in terms of $\eta$ and its conjugate complex $\bar\eta$
and may thus work entirely in terms of $\eta$ and $\bar\eta$. We have

$$\hbar\omega\,\eta\bar\eta = (2m)^{-1}(p+im\omega q)(p-im\omega q) = (2m)^{-1}[p^2+m^2\omega^2q^2+im\omega(qp-pq)] = H-\tfrac12\hbar\omega \tag{5}$$

and similarly

$$\hbar\omega\,\bar\eta\eta = H+\tfrac12\hbar\omega. \tag{6}$$

Thus

$$\bar\eta\eta-\eta\bar\eta = 1. \tag{7}$$

Equation (5) or (6) gives $H$ in terms of $\eta$ and $\bar\eta$ and (7) gives the
commutation relation connecting $\eta$ and $\bar\eta$. From (5)

$$\hbar\omega\,\bar\eta\eta\bar\eta = \bar\eta H-\tfrac12\hbar\omega\,\bar\eta$$

and from (6)

$$\hbar\omega\,\bar\eta\eta\bar\eta = H\bar\eta+\tfrac12\hbar\omega\,\bar\eta.$$

Thus

$$\bar\eta H-H\bar\eta = \hbar\omega\,\bar\eta. \tag{8}$$

Also, (7) leads to

$$\bar\eta\eta^n-\eta^n\bar\eta = n\eta^{n-1} \tag{9}$$

for any positive integer $n$, as may be verified by induction, since, by
multiplying (9) by $\eta$ on the left, we can deduce (9) with $n+1$ for $n$.

Let $H'$ be an eigenvalue of $H$ and $|H'\rangle$ an eigenket belonging to it.
From (5)

$$\hbar\omega\langle H'|\eta\bar\eta|H'\rangle = \langle H'|H-\tfrac12\hbar\omega|H'\rangle = (H'-\tfrac12\hbar\omega)\langle H'|H'\rangle.$$

Now $\langle H'|\eta\bar\eta|H'\rangle$ is the square of the length of the ket $\bar\eta|H'\rangle$, and
hence

$$\langle H'|\eta\bar\eta|H'\rangle \geq 0,$$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. Also $\langle H'|H'\rangle > 0$.
Thus

$$H' \geq \tfrac12\hbar\omega, \tag{10}$$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. From the form (1)
of $H$ as a sum of squares, we should expect its eigenvalues to be all
positive or zero (since the average value of $H$ for any state must be
positive or zero). We now have the more stringent condition (10).

From (8)

$$H\bar\eta|H'\rangle = (\bar\eta H-\hbar\omega\,\bar\eta)|H'\rangle = (H'-\hbar\omega)\bar\eta|H'\rangle. \tag{11}$$

Now if $H' \neq \tfrac12\hbar\omega$, $\bar\eta|H'\rangle$ is not zero and is then according to (11) an
eigenket of $H$ belonging to the eigenvalue $H'-\hbar\omega$. Thus, with $H'$
any eigenvalue of $H$ not equal to $\tfrac12\hbar\omega$, $H'-\hbar\omega$ is another eigenvalue
of $H$. We can repeat the argument and infer that, if $H'-\hbar\omega \neq \tfrac12\hbar\omega$,
$H'-2\hbar\omega$ is another eigenvalue of $H$. Continuing in this way, we
obtain the series of eigenvalues $H', H'-\hbar\omega, H'-2\hbar\omega, H'-3\hbar\omega,...$,
which cannot extend to infinity, because then it would contain
eigenvalues contradicting (10), and can terminate only with the value $\tfrac12\hbar\omega$.
Again, from the conjugate complex of equation (8)

$$H\eta|H'\rangle = (\eta H+\hbar\omega\,\eta)|H'\rangle = (H'+\hbar\omega)\eta|H'\rangle,$$

showing that $H'+\hbar\omega$ is another eigenvalue of $H$, with $\eta|H'\rangle$ as an
eigenket belonging to it, unless $\eta|H'\rangle = 0$. The latter alternative
can be ruled out, since it would lead to

$$0 = \hbar\omega\,\bar\eta\eta|H'\rangle = (H+\tfrac12\hbar\omega)|H'\rangle = (H'+\tfrac12\hbar\omega)|H'\rangle,$$

which contradicts (10). Thus $H'+\hbar\omega$ is always another eigenvalue
of $H$, and so are $H'+2\hbar\omega$, $H'+3\hbar\omega$ and so on. Hence the eigenvalues
of $H$ are the series of numbers

$$\tfrac12\hbar\omega,\ \tfrac32\hbar\omega,\ \tfrac52\hbar\omega,\ \tfrac72\hbar\omega,\ .... \tag{12}$$

extending to infinity. These are the possible energy values for the
harmonic oscillator.
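The relations (5) and (7) and the spectrum (12) can be checked with finite matrices. The sketch below assumes the standard matrix form of η in a basis of normalized stationary states, in which η connects the nth state to the (n+1)th with coefficient √(n+1); this matrix form and the cut-off N are assumptions of the illustration and are not derived at this point of the text. Units are chosen so that ħω = 1.

```python
import math

N = 6  # size of the truncated matrices (an arbitrary illustrative cut-off)

# <i| eta |j> = sqrt(j + 1) when i = j + 1: eta raises the quantum number by one.
eta = [[math.sqrt(j + 1) if i == j + 1 else 0.0 for j in range(N)]
       for i in range(N)]
# eta_bar is the conjugate complex of eta (here simply the transpose, eta being real).
eta_bar = [[eta[j][i] for j in range(N)] for i in range(N)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

eta_eta_bar = matmul(eta, eta_bar)
eta_bar_eta = matmul(eta_bar, eta)

# (7): eta_bar eta - eta eta_bar = 1, spoiled only in the last diagonal
# element, which is an artefact of cutting the infinite matrices off at N.
commutator = [[eta_bar_eta[i][j] - eta_eta_bar[i][j] for j in range(N)]
              for i in range(N)]

# (5): H = eta eta_bar + 1/2 in units with hbar * omega = 1.
energies = [eta_eta_bar[n][n] + 0.5 for n in range(N)]

print(energies)
```

The printed energies are 0.5, 1.5, 2.5, ... up to rounding, in agreement with (12), and the commutator is the unit matrix apart from its last diagonal element.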

Let $|0\rangle$ be an eigenket of $H$ belonging to the lowest eigenvalue
$\tfrac12\hbar\omega$, so that

$$\bar\eta|0\rangle = 0, \tag{13}$$

and form the sequence of kets

$$|0\rangle,\ \eta|0\rangle,\ \eta^2|0\rangle,\ \eta^3|0\rangle,\ .... \tag{14}$$

These kets are all eigenkets of $H$, belonging to the sequence of
eigenvalues (12) respectively. From (9) and (13)

$$\bar\eta\eta^n|0\rangle = n\eta^{n-1}|0\rangle \tag{15}$$

for any non-negative integer $n$. Thus the set of kets (14) is such that
$\eta$ or $\bar\eta$ applied to any one of the set gives a ket dependent on the set.
Now all the dynamical variables in our problem are expressible in terms
of $\eta$ and $\bar\eta$, so the kets (14) must form a complete set (otherwise there
would be some more dynamical variables). There is just one of these
kets for each eigenvalue (12) of $H$, so $H$ by itself forms a complete
commuting set of observables. The kets (14) correspond to the various
stationary states of the oscillator. The stationary state with energy
$(n+\tfrac12)\hbar\omega$, corresponding to $\eta^n|0\rangle$, is called the $n$th quantum state.
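The action of η and η̄ on the kets (14) is simple enough to program directly. In the sketch below a linear combination c_0|0> + c_1 η|0> + c_2 η²|0> + ... is stored as the list of its coefficients; this storage convention and the function names are assumptions made for the illustration. The operator η then shifts the list by one place, η̄ acts according to (15), and H follows from (5) with ħω = 1. Exact fractions are used so that the checks involve no rounding.

```python
from fractions import Fraction

def apply_eta(coeffs):
    """eta acting on sum_n c_n eta^n |0>: every power is raised by one."""
    return [Fraction(0)] + list(coeffs)

def apply_eta_bar(coeffs):
    """eta_bar acting on the same ket, using (15): eta^n |0> -> n eta^(n-1) |0>."""
    return [n * c for n, c in enumerate(coeffs)][1:] or [Fraction(0)]

def apply_hamiltonian(coeffs):
    """H = eta eta_bar + 1/2 from (5), in units with hbar * omega = 1."""
    raised = apply_eta(apply_eta_bar(coeffs))
    padded = raised + [Fraction(0)] * (len(coeffs) - len(raised))
    return [r + Fraction(1, 2) * c for r, c in zip(padded, coeffs)]

def eta_power(coeffs, n):
    """Apply eta n times."""
    for _ in range(n):
        coeffs = apply_eta(coeffs)
    return list(coeffs)

def sub(a, b):
    """Difference of two coefficient lists, padded to a common length."""
    size = max(len(a), len(b))
    a = list(a) + [Fraction(0)] * (size - len(a))
    b = list(b) + [Fraction(0)] * (size - len(b))
    return [x - y for x, y in zip(a, b)]

f = [Fraction(2), Fraction(-1), Fraction(3)]   # an arbitrary sample ket
n = 3
lhs_9 = sub(apply_eta_bar(eta_power(f, n)), eta_power(apply_eta_bar(f), n))
rhs_9 = [n * c for c in eta_power(f, n - 1)]   # n eta^(n-1) applied to f

third_state = eta_power([Fraction(1)], 3)      # eta^3 |0>
h_on_third_state = apply_hamiltonian(third_state)

print(lhs_9 == rhs_9)
print(h_on_third_state)
```

The first line printed confirms (9) on the sample ket, and the second shows that η³|0> is reproduced by H with the factor 7/2, the fourth member of the series (12).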

The square of the length of the ket $\eta^n|0\rangle$ is

$$\langle 0|\bar\eta^n\eta^n|0\rangle = n\langle 0|\bar\eta^{n-1}\eta^{n-1}|0\rangle$$

with the help of (15). By induction, we find that

$$\langle 0|\bar\eta^n\eta^n|0\rangle = n! \tag{16}$$

provided $|0\rangle$ is normalized. Thus the kets (14) multiplied by the
coefficients $n!^{-1/2}$ with $n = 0, 1, 2,...$, respectively form the basic kets
of a representation, namely the representation with $H$ diagonal. Any
ket $|x\rangle$ can be expanded in the form

$$|x\rangle = \sum_{n=0}^{\infty} x_n\eta^n|0\rangle, \tag{17}$$

where the $x_n$'s are numbers. In this way the ket $|x\rangle$ is put into
correspondence with a power series $\sum x_n\eta^n$ in the variable $\eta$, the
various terms in the power series corresponding to the various
stationary states. If $|x\rangle$ is normalized, it defines a state for which
the probability of the oscillator being in the $n$th quantum state,
i.e. the probability of $H$ having the value $(n+\tfrac12)\hbar\omega$, is

$$P_n = n!\,|x_n|^2, \tag{18}$$

as follows from the same argument which led to (51) of § 18.
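Formula (18) is easily applied to a given set of coefficients. The sketch below takes three arbitrary sample coefficients (an assumption for illustration), normalizes the ket using (16) together with the orthogonality of eigenkets of H belonging to different eigenvalues, and returns the probabilities of the successive quantum states.

```python
import math

def state_probabilities(x):
    """P_n = n! |x_n|^2 for |x> = sum_n x_n eta^n |0>, after normalizing |x>."""
    weights = [math.factorial(n) * abs(c) ** 2 for n, c in enumerate(x)]
    norm = sum(weights)   # <x|x>, by (16) and the orthogonality of the kets (14)
    return [w / norm for w in weights]

probabilities = state_probabilities([1.0, 1.0, 0.5j])
print(probabilities)
```

Note the weight n! attached to the nth coefficient: equal coefficients do not mean equal probabilities once n exceeds 1.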

We may consider the ket $|0\rangle$ as a standard ket and the power series
in $\eta$ as a wave function, since any ket can be expressed as such a
wave function multiplied into this standard ket. The present kind
of wave function differs from the usual kind, introduced by equations
(62) of § 20, in that it is a function of the complex dynamical variable
$\eta$ instead of observables. It is, however, for many purposes the most
convenient wave function to use for describing states of the harmonic
oscillator. The standard ket $|0\rangle$ satisfies the condition (13), which
replaces the conditions (43) of § 22 for the standard ket in
Schrödinger's representation.

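Let us introduce Schrödinger's representation with $q$ diagonal and
obtain the representatives of the stationary states. From (13) and (3)

$$(p-im\omega q)|0\rangle = 0,$$

so

$$\langle q'|p-im\omega q|0\rangle = 0.$$

With the help of (45) of § 22, this gives

$$\hbar\frac{\partial}{\partial q'}\langle q'|0\rangle+m\omega q'\langle q'|0\rangle = 0. \tag{19}$$

The solution of this differential equation is

$$\langle q'|0\rangle = (m\omega/\pi\hbar)^{1/4}e^{-m\omega q'^2/2\hbar}, \tag{20}$$

the numerical coefficient being chosen so as to make $|0\rangle$ normalized.
We have here the representative of the normal state, as the state of
lowest energy is called. The representatives of the other stationary
states can be obtained from it. We have from (3)

$$\langle q'|\eta^n|0\rangle = (2m\hbar\omega)^{-n/2}\langle q'|(p+im\omega q)^n|0\rangle = (2m\hbar\omega)^{-n/2}\,i^n\Big(-\hbar\frac{\partial}{\partial q'}+m\omega q'\Big)^n\langle q'|0\rangle$$

$$= i^n(2m\hbar\omega)^{-n/2}(m\omega/\pi\hbar)^{1/4}\Big(-\hbar\frac{\partial}{\partial q'}+m\omega q'\Big)^ne^{-m\omega q'^2/2\hbar}. \tag{21}$$

This may easily be worked out for small values of $n$. The result is of
the form of $e^{-m\omega q'^2/2\hbar}$ times a power series of degree $n$ in $q'$. A further
factor $n!^{-1/2}$ must be inserted in (21) to get the normalized
representative of the $n$th quantum state. The factor $i^n$ may be discarded, being
merely a phase factor.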
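The working out of (21) for small n can be mechanized. In units with m = ω = ħ = 1 (a choice made here only for convenience), the operator −∂/∂q' + q' applied to a polynomial P(q') times the Gaussian factor gives (2q'P − dP/dq') times the same factor. The sketch below repeats this step n times, storing a polynomial as its list of coefficients in increasing powers; the function names are hypothetical.

```python
def raise_polynomial(p):
    """Map P -> 2 q P - dP/dq for a polynomial given as coefficients [c0, c1, ...]."""
    shifted = [0] + [2 * c for c in p]                  # 2 q P
    derivative = [k * c for k, c in enumerate(p)][1:]   # dP/dq
    derivative += [0] * (len(shifted) - len(derivative))
    return [a - b for a, b in zip(shifted, derivative)]

def oscillator_polynomial(n):
    """Polynomial factor of (-d/dq + q)^n exp(-q^2 / 2), units m = omega = hbar = 1."""
    p = [1]
    for _ in range(n):
        p = raise_polynomial(p)
    return p

print(oscillator_polynomial(2))   # 4 q^2 - 2
print(oscillator_polynomial(3))   # 8 q^3 - 12 q
```

The polynomial obtained is of degree n, as stated above; these particular polynomials are the Hermite polynomials.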
35. Angular momentum

Let us consider a particle described by the three Cartesian
coordinates $x$, $y$, $z$ and their conjugate momenta $p_x$, $p_y$, $p_z$. Its angular
momentum about the origin is defined as in the classical theory, by

$$m_x = yp_z-zp_y, \qquad m_y = zp_x-xp_z, \qquad m_z = xp_y-yp_x, \tag{22}$$

or by the vector equation

$$\mathbf{m} = \mathbf{x}\times\mathbf{p}.$$

We must evaluate the P.B.s of the angular momentum components
with the dynamical variables $x$, $p_x$, etc., and with each other. This
we can do most conveniently with the help of the laws (4) and (5) of
§ 21, thus

$$[m_z,x] = [xp_y-yp_x,x] = -y[p_x,x] = y, \qquad [m_z,y] = [xp_y-yp_x,y] = x[p_y,y] = -x, \tag{23}$$

$$[m_z,z] = [xp_y-yp_x,z] = 0, \tag{24}$$

and similarly,

$$[m_z,p_x] = p_y, \qquad [m_z,p_y] = -p_x, \tag{25}$$

$$[m_z,p_z] = 0, \tag{26}$$

with corresponding relations for $m_x$ and $m_y$. Again

$$[m_y,m_z] = [zp_x-xp_z,m_z] = z[p_x,m_z]-[x,m_z]p_z = -zp_y+yp_z = m_x,$$

$$[m_z,m_x] = m_y, \qquad [m_x,m_y] = m_z. \tag{27}$$

These results are all the same as in the classical theory. The sign in
the results (23), (25), and (27) may easily be remembered from the
rule that the + sign occurs when the three dynamical variables,
consisting of the two in the P.B. on the left-hand side and the one
forming the result on the right, are in the cyclic order $(xyz)$ and the
− sign occurs otherwise. Equations (27) may be put in the vector
form

$$\mathbf{m}\times\mathbf{m} = i\hbar\,\mathbf{m}. \tag{28}$$

Now suppose we have several particles with angular momenta
$\mathbf{m}_1, \mathbf{m}_2,...$. Each of these angular momentum vectors will satisfy
(28), thus

$$\mathbf{m}_r\times\mathbf{m}_r = i\hbar\,\mathbf{m}_r,$$

and any one of them will commute with any other, so that

$$\mathbf{m}_r\times\mathbf{m}_s+\mathbf{m}_s\times\mathbf{m}_r = 0 \qquad (r \neq s).$$

Hence if $\mathbf{M} = \sum_r \mathbf{m}_r$ is the total angular momentum,

$$\mathbf{M}\times\mathbf{M} = \sum_{rs}\mathbf{m}_r\times\mathbf{m}_s = \sum_r \mathbf{m}_r\times\mathbf{m}_r+\sum_{r<s}(\mathbf{m}_r\times\mathbf{m}_s+\mathbf{m}_s\times\mathbf{m}_r) = i\hbar\sum_r\mathbf{m}_r = i\hbar\,\mathbf{M}. \tag{29}$$

This result is of the same form as (28), so that the components of the
total angular momentum $\mathbf{M}$ of any number of particles satisfy the
same commutation relations as those of the angular momentum of
a single particle.
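Since the P.B.s (23) to (27) are the same as in the classical theory, they can be checked with ordinary numbers. The sketch below evaluates the classical Poisson bracket by central differences at an arbitrary sample point of phase space; the point and the helper names are assumptions of the illustration. Because the angular momentum components are bilinear in the coordinates and momenta, the central differences are exact apart from rounding.

```python
def poisson_bracket(u, v, state, h=1e-4):
    """[u, v] = sum_r (du/dq_r dv/dp_r - du/dp_r dv/dq_r), by central differences.

    state = [x, y, z, px, py, pz]; u and v map such a list to a number.
    """
    def d(f, i):
        up = list(state)
        dn = list(state)
        up[i] += h
        dn[i] -= h
        return (f(up) - f(dn)) / (2.0 * h)
    return sum(d(u, r) * d(v, r + 3) - d(u, r + 3) * d(v, r) for r in range(3))

def m_x(s):
    return s[1] * s[5] - s[2] * s[4]   # y pz - z py

def m_y(s):
    return s[2] * s[3] - s[0] * s[5]   # z px - x pz

def m_z(s):
    return s[0] * s[4] - s[1] * s[3]   # x py - y px

def x(s):
    return s[0]

def y(s):
    return s[1]

def p_x(s):
    return s[3]

def p_y(s):
    return s[4]

point = [0.3, -1.2, 0.7, 1.1, 0.4, -0.9]   # arbitrary sample values

print(poisson_bracket(m_z, x, point), y(point))        # (23): [m_z, x] = y
print(poisson_bracket(m_z, p_x, point), p_y(point))    # (25): [m_z, p_x] = p_y
print(poisson_bracket(m_x, m_y, point), m_z(point))    # (27): [m_x, m_y] = m_z
```

Each printed pair agrees, and the remaining relations can be checked in the same way by permuting the arguments.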

Let $A_x$, $A_y$, $A_z$ denote the three coordinates of any one of the
particles, or else the three components of momentum of one of
the particles. The $A$'s will commute with the angular momenta of
the other particles, and hence from (23), (24), (25), and (26)

$$[M_z,A_x] = A_y, \qquad [M_z,A_y] = -A_x, \qquad [M_z,A_z] = 0. \tag{30}$$

If $B_x$, $B_y$, $B_z$ are a second set of three quantities denoting the
coordinates or momentum components of one of the particles, they
will satisfy similar relations to (30). We shall then have

$$[M_z, A_xB_x+A_yB_y+A_zB_z] = [M_z,A_x]B_x+A_x[M_z,B_x]+[M_z,A_y]B_y+A_y[M_z,B_y]$$

$$= A_yB_x+A_xB_y-A_xB_y-A_yB_x = 0.$$

Thus the scalar product $A_xB_x+A_yB_y+A_zB_z$ commutes with $M_z$,
and similarly with $M_x$ and $M_y$. Introduce the vector product

$$\mathbf{A}\times\mathbf{B} = \mathbf{C}$$

or

$$A_yB_z-A_zB_y = C_x, \qquad A_zB_x-A_xB_z = C_y, \qquad A_xB_y-A_yB_x = C_z.$$

We have

$$[M_z,C_x] = -A_xB_z+A_zB_x = C_y$$

and similarly

$$[M_z,C_y] = -C_x, \qquad [M_z,C_z] = 0.$$

These equations are again of the form (30) with $\mathbf{C}$ for $\mathbf{A}$. We can
conclude from this work that equations of the form (30) hold for the
three components of any vector that we can construct from our
dynamical variables, and that any scalar commutes with $\mathbf{M}$.

We can introduce linear operators $R$ referring to rotations about
the origin in the same way in which we introduced the linear operators
$D$ in § 25 referring to displacements. Taking a rotation through an
angle $\delta\phi$ about the $z$-axis and making $\delta\phi$ infinitesimal, we can obtain
the limit operator corresponding to (64) of § 25,

$$\lim_{\delta\phi\to0}(R-1)/\delta\phi,$$

which we shall call the rotation operator about the $z$-axis and denote
by $r_z$. Like the displacement operators, $r_z$ is a pure imaginary linear
operator and is undetermined to the extent of an arbitrary additive
pure imaginary number. Corresponding to (66) of § 25, the change
in any dynamical variable $v$ caused by a rotation through a small
angle $\delta\phi$ about the $z$-axis is

$$\delta\phi\,(r_zv-vr_z), \tag{31}$$

to the first order in $\delta\phi$. Now the changes produced in the three
components $A_x$, $A_y$, $A_z$ of a vector by a (right-handed) rotation $\delta\phi$
about the $z$-axis applied to all measuring apparatus are $\delta\phi\,A_y$,
$-\delta\phi\,A_x$, and 0 respectively, and any scalar quantity is unchanged by
the rotation. Equating these changes to (31), we find that

$$r_zA_x-A_xr_z = A_y, \qquad r_zA_y-A_yr_z = -A_x, \qquad r_zA_z-A_zr_z = 0,$$

and $r_z$ commutes with any scalar. Comparing these results with (30),
we see that $i\hbar r_z$ satisfies the same commutation relations as $M_z$.
Their difference, $M_z-i\hbar r_z$, commutes with all the dynamical variables
and must therefore be a number. This number, which is necessarily
real since $M_z$ and $i\hbar r_z$ are real, may be made zero by a suitable choice
of the arbitrary pure imaginary number that can be added to $r_z$. We
then have the result

$$M_z = i\hbar r_z. \tag{32}$$

Similar equations hold for $M_x$ and $M_y$. They are the analogues of (69)
of § 25. Thus the total angular momentum is connected with the
rotation operators as the total momentum is connected with the displacement
operators. This conclusion is valid for any point as origin.

The above argument applies to the angular momentum arising
from the motion of particles, defined by (22) for each particle. There
is another kind of angular momentum occurring in atomic theory,
spin angular momentum. The former kind of angular momentum will
be called orbital angular momentum, to distinguish it. The spin
angular momentum of a particle should be pictured as due to some internal
motion of the particle, so that it is associated with different degrees
of freedom from those describing the motion of the particle as a whole,
and hence the dynamical variables that describe the spin must
commute with $x$, $y$, $z$, $p_x$, $p_y$, and $p_z$. The spin does not correspond very
closely to anything in classical mechanics, so the method of classical
analogy is not suitable for studying it. However, we can build up a
theory of the spin simply from the assumption that the components
of the spin angular momentum are connected with the rotation
operators in the same way as we had above for orbital angular momentum,
i.e. equation (32) holds with $M_z$ as the $z$ component of the spin angular
momentum of a particle and $r_z$ as the rotation operator about the
$z$-axis referring to states of spin of that particle. With this
assumption, the commutation relations connecting the components of the
spin angular momentum $\mathbf{M}$ with any vector $\mathbf{A}$ referring to the spin
must be of the standard form (30), and hence, taking $\mathbf{A}$ to be the
spin angular momentum itself, we have equation (29) holding also
for the spin. We now have (29) holding quite generally, for any sum
of spin and orbital angular momenta, and also (30) will hold generally,
for $\mathbf{M}$ the total spin and orbital angular momentum and $\mathbf{A}$ any vector
dynamical variable, and the connexion between angular momentum
and rotation operators will be always valid.
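As a concrete instance of (29) holding for a spin, one may take the familiar 2 × 2 matrices for a spin of one half, namely ħ/2 times the Pauli matrices. These matrices are assumed here for illustration and are not derived in the present section. The sketch checks the three components of M × M = iħM, written out as commutators, in units with ħ = 1.

```python
HBAR = 1.0  # units with hbar = 1 (an arbitrary illustrative choice)

# Spin one-half matrices, hbar/2 times the Pauli matrices (assumed, not derived here).
M_X = [[0, HBAR / 2], [HBAR / 2, 0]]
M_Y = [[0, -1j * HBAR / 2], [1j * HBAR / 2, 0]]
M_Z = [[HBAR / 2, 0], [0, -HBAR / 2]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """The matrix a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scaled(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Components of M x M = i hbar M:
print(close(commutator(M_X, M_Y), scaled(1j * HBAR, M_Z)))
print(close(commutator(M_Y, M_Z), scaled(1j * HBAR, M_X)))
print(close(commutator(M_Z, M_X), scaled(1j * HBAR, M_Y)))
```

All three comparisons hold, so these matrices satisfy the same commutation relations as the orbital angular momentum components, although they act on quite different degrees of freedom.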

As an immediate consequence of this connexion, we can deduce the
law of conservation of angular momentum. For an isolated system, the
Hamiltonian must be unchanged by any rotation about the origin, in
other words it must be a scalar, so it must commute with the angular
momentum about the origin. Thus the angular momentum is a
constant of the motion. For this argument the origin may be any
point.

As a second immediate consequence, we can deduce that a state with zero
total angular momentum is spherically symmetrical. The state will
correspond to a ket $|S\rangle$, say, satisfying

$$M_x|S\rangle = M_y|S\rangle = M_z|S\rangle = 0,$$

and hence

$$r_x|S\rangle = r_y|S\rangle = r_z|S\rangle = 0.$$

This shows that the ket $|S\rangle$ is unaltered by infinitesimal
rotations, and it must therefore be unaltered by finite rotations, since
the latter can be built up from infinitesimal ones. Thus the state is
spherically symmetrical. The converse theorem, a spherically symmetrical
state has zero total angular momentum, is also true, though its proof is
not quite so simple. A spherically symmetrical state corresponds to a ket
$|S\rangle$ whose direction is unaltered by any rotation. Thus the change
in $|S\rangle$ produced by a rotation operator $r_x$, $r_y$, or $r_z$
must be a numerical multiple of $|S\rangle$, say

$$r_x|S\rangle = c_x|S\rangle, \quad r_y|S\rangle = c_y|S\rangle, \quad r_z|S\rangle = c_z|S\rangle,$$

where the $c$'s are numbers. This gives

$$M_x|S\rangle = i\hbar c_x|S\rangle, \quad M_y|S\rangle = i\hbar c_y|S\rangle, \quad M_z|S\rangle = i\hbar c_z|S\rangle. \tag{33}$$

These equations are not consistent with the commutation relations (29)
for $M_x$, $M_y$, $M_z$ unless $c_x = c_y = c_z = 0$, in which case the
state has zero total angular momentum. We have in (33) an example of a
ket which is simultaneously an eigenket of the three non-commuting linear
operators $M_x$, $M_y$, $M_z$, and this is possible only if all three
eigenvalues are zero.

36. Properties of angular momentum 

There are some general properties of angular momentum, deducible simply
from the commutation relations between the three components. These
properties must hold equally for spin and orbital angular momentum. Let
$m_x$, $m_y$, $m_z$ be the three components of an angular momentum, and
introduce the quantity $\beta$ defined by

$$\beta = m_x^2 + m_y^2 + m_z^2.$$

Since $\beta$ is a scalar it must commute with $m_x$, $m_y$, and $m_z$.
Let us suppose we have a dynamical system for which $m_x$, $m_y$, $m_z$
are the only dynamical variables. Then $\beta$ commutes with everything
and must be a number. We can study this dynamical system on much the same
lines as we used for the harmonic oscillator in § 34.

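Put

$$m_x - i m_y = \eta.$$

From the commutation relations (27) we get

$$\bar\eta\eta = (m_x + i m_y)(m_x - i m_y) = m_x^2 + m_y^2 - i(m_x m_y - m_y m_x) = \beta - m_z^2 + \hbar m_z \tag{34}$$

and similarly

$$\eta\bar\eta = \beta - m_z^2 - \hbar m_z. \tag{35}$$

Thus

$$\bar\eta\eta - \eta\bar\eta = 2\hbar m_z. \tag{36}$$

Also

$$m_z\eta - \eta m_z = i\hbar m_y - \hbar m_x = -\hbar\eta. \tag{37}$$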

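We assume that the components of an angular momentum are observables and
thus $m_z$ has eigenvalues. Let $m_z'$ be one of them, and $|m_z'\rangle$
an eigenket belonging to it. From (34)

$$\langle m_z'|\bar\eta\eta|m_z'\rangle = \langle m_z'|(\beta - m_z^2 + \hbar m_z)|m_z'\rangle = (\beta - m_z'^2 + \hbar m_z')\langle m_z'|m_z'\rangle.$$

The left-hand side here is the square of the length of the ket
$\eta|m_z'\rangle$ and is thus greater than or equal to zero, the case of
equality occurring if and only if $\eta|m_z'\rangle = 0$. Hence

$$\beta - m_z'^2 + \hbar m_z' \geq 0,$$

or

$$\beta + \tfrac{1}{4}\hbar^2 \geq (m_z' - \tfrac{1}{2}\hbar)^2. \tag{38}$$

Thus $\beta + \tfrac{1}{4}\hbar^2 \geq 0$.

Defining the number $k$ by

$$k + \tfrac{1}{2}\hbar = (\beta + \tfrac{1}{4}\hbar^2)^{1/2} = (m_x^2 + m_y^2 + m_z^2 + \tfrac{1}{4}\hbar^2)^{1/2}, \tag{39}$$

so that $k \geq -\tfrac{1}{2}\hbar$, the inequality (38) becomes

$$k + \tfrac{1}{2}\hbar \geq |m_z' - \tfrac{1}{2}\hbar|$$

or

$$k + \hbar \geq m_z' \geq -k. \tag{40}$$

An equality occurs if and only if $\eta|m_z'\rangle = 0$. Similarly from (35)

$$\langle m_z'|\eta\bar\eta|m_z'\rangle = (\beta - m_z'^2 - \hbar m_z')\langle m_z'|m_z'\rangle,$$

showing that

$$\beta - m_z'^2 - \hbar m_z' \geq 0$$

or

$$k \geq m_z' \geq -k - \hbar,$$

with an equality occurring if and only if $\bar\eta|m_z'\rangle = 0$. This
result combined with (40) shows that $k \geq 0$ and

$$k \geq m_z' \geq -k, \tag{41}$$

with $m_z' = k$ if $\bar\eta|m_z'\rangle = 0$ and $m_z' = -k$ if
$\eta|m_z'\rangle = 0$.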

From (37)

$$m_z\eta|m_z'\rangle = (\eta m_z - \hbar\eta)|m_z'\rangle = (m_z' - \hbar)\eta|m_z'\rangle.$$

Now if $m_z' \neq -k$, $\eta|m_z'\rangle$ is not zero and is then an
eigenket of $m_z$ belonging to the eigenvalue $m_z' - \hbar$. Similarly,
if $m_z' - \hbar \neq -k$, $m_z' - 2\hbar$ is another eigenvalue of
$m_z$, and so on. We get in this way a series of eigenvalues $m_z'$,
$m_z' - \hbar$, $m_z' - 2\hbar$, ..., which must terminate from (41), and
can terminate only with the value $-k$. Again, from the conjugate complex
of equation (37)

$$m_z\bar\eta|m_z'\rangle = (\bar\eta m_z + \hbar\bar\eta)|m_z'\rangle = (m_z' + \hbar)\bar\eta|m_z'\rangle,$$

showing that $m_z' + \hbar$ is another eigenvalue of $m_z$ unless
$\bar\eta|m_z'\rangle = 0$, in which case $m_z' = k$. Continuing in this
way we get a series of eigenvalues $m_z'$, $m_z' + \hbar$,
$m_z' + 2\hbar$, ..., which must terminate from (41), and can terminate
only with the value $k$. We can conclude that $2k$ is an integral
multiple of $\hbar$ and that the eigenvalues of $m_z$ are

$$k, \; k - \hbar, \; k - 2\hbar, \; \ldots, \; -k + \hbar, \; -k. \tag{42}$$

The eigenvalues of $m_x$ and $m_y$ are the same, from symmetry. These
eigenvalues are all integral or half odd integral multiples of $\hbar$,
according to whether $2k$ is an even or odd multiple of $\hbar$.
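
The ladder argument can be checked numerically. The following is a minimal sketch, not part of the original text, which assumes units with hbar = 1 and a basis of normalized eigenkets of m_z ordered from the eigenvalue k down to -k. The function names (`mz_eigenvalues`, `ladder_matrices`, and the small matrix helpers) are illustrative choices. The matrix elements of eta are fixed, up to a phase taken here as real and positive, by the squared length given in (34) with beta = k(k + 1), which follows from (39).

```python
from fractions import Fraction
from math import sqrt, isclose


def mz_eigenvalues(k):
    """Eigenvalues (42) of m_z in units of hbar: k, k-1, ..., -k."""
    k = Fraction(k)
    if k < 0 or (2 * k).denominator != 1:
        raise ValueError("2k must be a non-negative integral multiple of hbar")
    return [k - i for i in range(int(2 * k) + 1)]


def ladder_matrices(k):
    """Return (m_z, eta, eta_bar) as nested lists, with hbar = 1.

    eta lowers the eigenvalue of m_z by one unit; the squared length of
    eta|m> is beta - m^2 + m from (34), with beta = k(k + 1).
    """
    ms = mz_eigenvalues(k)
    n = len(ms)
    beta = float(Fraction(k) * (Fraction(k) + 1))
    mz = [[float(ms[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    eta = [[0.0] * n for _ in range(n)]
    for j in range(n - 1):
        m = float(ms[j])
        eta[j + 1][j] = sqrt(beta - m * m + m)  # element <m - 1| eta |m>
    # eta is real in this basis, so its conjugate complex is the transpose.
    eta_bar = [[eta[j][i] for j in range(n)] for i in range(n)]
    return mz, eta, eta_bar


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def close(a, b):
    return all(isclose(x, y, abs_tol=1e-12)
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

With these helpers one can confirm, for example for k = 3/2, that `eta_bar * eta - eta * eta_bar` equals `2 * m_z` as in (36) and that `m_z * eta - eta * m_z` equals `-eta` as in (37).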

Let $|\text{max}\rangle$ be an eigenket of $m_z$ belonging to the maximum
eigenvalue $k$, so that

$$\bar\eta|\text{max}\rangle = 0, \tag{43}$$

and form the sequence of kets

$$|\text{max}\rangle, \; \eta|\text{max}\rangle, \; \eta^2|\text{max}\rangle, \; \ldots, \; \eta^{2k/\hbar}|\text{max}\rangle. \tag{44}$$

These kets are all eigenkets of $m_z$, belonging to the sequence of
eigenvalues (42) respectively. The set of kets (44) is such that the
operator $\eta$ applied to any one of them gives a ket dependent on the
set ($\eta$ applied to the last gives zero), and from (36) and (43) one
sees that $\bar\eta$ applied to any one of the set also gives a ket
dependent on the set. All the dynamical variables for the system we are
now dealing with are expressible in terms of $\eta$ and $\bar\eta$, so
the set of kets (44) is a complete set. There is just one of these kets
for each eigenvalue (42) of $m_z$, so $m_z$ by itself forms a complete
commuting set of observables.

It is convenient to define the magnitude of the angular momentum vector
$\mathbf{m}$ to be $k$, given by (39), rather than $\beta^{1/2}$, because
the possible values for $k$ are

$$0, \; \tfrac{1}{2}\hbar, \; \hbar, \; \tfrac{3}{2}\hbar, \; 2\hbar, \; \ldots, \tag{45}$$

extending to infinity, while the possible values for $\beta^{1/2}$ are a
more complicated set of numbers.

For a dynamical system involving other dynamical variables besides $m_x$,
$m_y$, and $m_z$, there may be variables that do not commute with
$\beta$. Then $\beta$ is no longer a number, but a general linear
operator. This happens for any orbital angular momentum (22), as $x$,
$y$, $z$, $p_x$, $p_y$, and $p_z$ do not commute with $\beta$. We shall
assume that $\beta$ is always an observable, and $k$ can then be defined
by (39) with the positive square root function and is also an observable.
We shall call $k$ so defined the magnitude of the angular momentum vector
$\mathbf{m}$ in the general case. The above analysis by which we obtained
the eigenvalues of $m_z$ is still valid if we replace $|m_z'\rangle$ by a
simultaneous eigenket $|k'm_z'\rangle$ of the commuting observables $k$
and $m_z$, and leads to the result that the possible eigenvalues for $k$
are the numbers (45), and for each eigenvalue $k'$ of $k$ the eigenvalues
of $m_z$ are the numbers (42) with $k'$ substituted for $k$. We have here
an example of a phenomenon which we have not met with previously, namely
that with two commuting observables, the eigenvalues of one depend on
what eigenvalue we assign to the other. This phenomenon may be understood
as the two observables being not altogether independent, but partially
functions of one another. The number of independent simultaneous
eigenkets of $k$ and $m_z$ belonging to the eigenvalues $k'$ and $m_z'$
must be independent of $m_z'$, since for each independent
$|k'm_z'\rangle$ we can obtain an independent $|k'm_z''\rangle$, for any
$m_z''$ in the sequence (42), by multiplying $|k'm_z'\rangle$ by a
suitable power of $\eta$ or $\bar\eta$.

As an example let us consider a dynamical system with two angular momenta
$\mathbf{m}_1$ and $\mathbf{m}_2$, which commute with one another. If
there are no other dynamical variables, then all the dynamical variables
commute with the magnitudes $k_1$ and $k_2$ of $\mathbf{m}_1$ and
$\mathbf{m}_2$, so $k_1$ and $k_2$ are numbers. However, the magnitude
$K$ of the resultant angular momentum
$\mathbf{M} = \mathbf{m}_1 + \mathbf{m}_2$ is not a number (it does not
commute with the components of $\mathbf{m}_1$ and $\mathbf{m}_2$) and it
is interesting to work out the eigenvalues of $K$. This can be done most
simply by a method of counting independent kets. There is one independent
simultaneous eigenket of $m_{1z}$ and $m_{2z}$ belonging to any
eigenvalue $m_{1z}'$ having one of the values $k_1$, $k_1 - \hbar$,
$k_1 - 2\hbar$, ..., $-k_1$ and any eigenvalue $m_{2z}'$ having one of
the values $k_2$, $k_2 - \hbar$, $k_2 - 2\hbar$, ..., $-k_2$, and this
ket is an eigenket of $M_z$ belonging to the eigenvalue
$M_z' = m_{1z}' + m_{2z}'$. The possible values of $M_z'$ are thus
$k_1 + k_2$, $k_1 + k_2 - \hbar$, $k_1 + k_2 - 2\hbar$, ...,
$-k_1 - k_2$, and the number of times each of them occurs is given by the
following scheme (if we assume for definiteness that $k_1 \geq k_2$, and
read $k_2$ in the second row in units of $\hbar$),

$$\begin{array}{ccccccccccc}
k_1+k_2, & k_1+k_2-\hbar, & k_1+k_2-2\hbar, & \ldots, & k_1-k_2, & k_1-k_2-\hbar, & \ldots, & -k_1+k_2, & -k_1+k_2-\hbar, & \ldots, & -k_1-k_2 \\
1 & 2 & 3 & \ldots & 2k_2+1 & 2k_2+1 & \ldots & 2k_2+1 & 2k_2 & \ldots & 1
\end{array} \tag{46}$$

Now each eigenvalue $K'$ of $K$ will be associated with the eigenvalues
$K'$, $K' - \hbar$, $K' - 2\hbar$, ..., $-K'$ for $M_z$, with the same
number of independent simultaneous eigenkets of $K$ and $M_z$ for each of
them. The total number of independent eigenkets of $M_z$ belonging to any
eigenvalue $M_z'$ must be the same, whether we take them to be
simultaneous eigenkets of $m_{1z}$ and $m_{2z}$, or simultaneous
eigenkets of $K$ and $M_z$, i.e. it is always given by the scheme (46).
It follows that the eigenvalues for $K$ are

$$k_1 + k_2, \; k_1 + k_2 - \hbar, \; k_1 + k_2 - 2\hbar, \; \ldots, \; k_1 - k_2, \tag{47}$$

and that for each of these eigenvalues for $K$ and an eigenvalue for
$M_z$ going with it there is just one independent simultaneous eigenket
of $K$ and $M_z$.
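
The method of counting independent kets lends itself to a short computation. The sketch below is an illustration under stated assumptions rather than anything given in the text: angular momenta are measured in units of hbar, exact fractions are used so that half odd integral values cause no rounding trouble, and the names `mz_values` and `resultant_magnitudes` are hypothetical. It tallies how often each value of M_z' occurs and then repeatedly removes the full sequence K', K' - 1, ..., -K' belonging to the largest remaining value.

```python
from collections import Counter
from fractions import Fraction


def mz_values(k):
    """Values k, k-1, ..., -k (units of hbar) taken by a z component."""
    k = Fraction(k)
    return [k - i for i in range(int(2 * k) + 1)]


def resultant_magnitudes(k1, k2):
    """Eigenvalues of K for M = m1 + m2, found by counting eigenkets of M_z."""
    # One independent ket for every pair (m1z', m2z'); tally M_z' = m1z' + m2z'.
    counts = Counter(a + b for a in mz_values(k1) for b in mz_values(k2))
    magnitudes = []
    while counts:
        top = max(counts)          # the largest remaining M_z' must be a K'
        magnitudes.append(top)
        for m in mz_values(top):   # remove one ket for each M_z' going with K'
            counts[m] -= 1
            if counts[m] == 0:
                del counts[m]
    return magnitudes
```

For instance, `resultant_magnitudes(Fraction(3, 2), 1)` yields the values 5/2, 3/2, 1/2, which run from k_1 + k_2 down to k_1 - k_2 in agreement with (47).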

The effect of rotations on eigenkets of angular momentum variables should
be noted. Take any eigenket $|M_z'\rangle$ of the z component of total
angular momentum for any dynamical system, and apply to it a small
rotation through an angle $\delta\phi$ about the z-axis. It will change
into

$$(1 + \delta\phi\, r_z)|M_z'\rangle = (1 - i\,\delta\phi\, M_z/\hbar)|M_z'\rangle$$

with the help of (32). This equals

$$e^{-i\,\delta\phi\, M_z'/\hbar}|M_z'\rangle$$

to the first order in $\delta\phi$. Thus $|M_z'\rangle$ gets multiplied
by the numerical factor $e^{-i\,\delta\phi\, M_z'/\hbar}$. By applying a
succession of these small rotations, we find that the application of a
finite rotation through an angle $\phi$ about the z-axis causes
$|M_z'\rangle$ to get multiplied by $e^{-i\phi M_z'/\hbar}$. Putting
$\phi = 2\pi$, we find that an application of one revolution about the
z-axis leaves $|M_z'\rangle$ unchanged if the eigenvalue $M_z'$ is an
integral multiple of $\hbar$ and causes $|M_z'\rangle$ to change sign if
$M_z'$ is half an odd integral multiple of $\hbar$. Now consider an
eigenket $|K'\rangle$ of the magnitude $K$ of the total angular momentum.
If the eigenvalue $K'$ is an integral multiple of $\hbar$, the possible
eigenvalues of $M_z$ are all integral multiples of $\hbar$ and the
application of one revolution about the z-axis must leave $|K'\rangle$
unchanged. Conversely, if $K'$ is half an odd integral multiple of
$\hbar$, the possible eigenvalues of $M_z$ are all half odd integral
multiples of $\hbar$ and the revolution must change the sign of
$|K'\rangle$. From symmetry, the application of a revolution about any
other axis must have the same effect on $|K'\rangle$ as one about the
z-axis. We thus get the general result, the application of one revolution
about any axis leaves a ket unchanged or changes its sign according to
whether it belongs to eigenvalues of the magnitude of the total angular
momentum which are integral or half odd integral multiples of $\hbar$. A
state, of course, is always unaffected by the revolution, since a state
is unaffected by a change of sign of the ket corresponding to it.
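
The passage from small rotations to a complete revolution can be illustrated numerically. This is a rough sketch under the assumption that M_z' is expressed in units of hbar; the function names are illustrative only. `compose_small_rotations` multiplies together many first-order factors (1 - i dphi M_z'), and `rotation_factor` gives the exponential they tend to.

```python
import cmath
from fractions import Fraction


def rotation_factor(m, phi):
    """Numerical factor exp(-i*phi*M_z'/hbar) for M_z' = m*hbar."""
    return cmath.exp(-1j * phi * float(m))


def compose_small_rotations(m, phi, steps):
    """Product of `steps` first-order factors, each through the angle phi/steps."""
    small = 1 - 1j * (phi / steps) * float(m)
    return small ** steps


def revolution_sign(m):
    """+1 if one revolution leaves the ket unchanged, -1 if it changes its sign."""
    return round(rotation_factor(m, 2 * cmath.pi).real)
```

For integral m the sign is +1 and for half odd integral m it is -1; the product of small rotations approaches the same factor as the number of steps is increased.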

For a dynamical system involving only orbital angular momenta, a ket must
be unchanged by a revolution about an axis, since we can set up
Schrödinger's representation, with the coordinates of all the particles
diagonal, and the Schrödinger representative of a ket will get brought
back to its original value by the revolution. It follows that the
eigenvalues of the magnitude of an orbital angular momentum are always
integral multiples of $\hbar$. The eigenvalues of a component of an
orbital angular momentum are also always integral multiples of $\hbar$.
For a spin angular momentum, Schrödinger's representation does not exist
and both kinds of eigenvalue are possible.

37. The spin of the electron 

Electrons, and also some of the other fundamental particles (protons,
neutrons) have a spin whose magnitude is $\tfrac{1}{2}\hbar$. This is
found from experimental evidence, and also there are theoretical reasons
showing that this spin value is more elementary than any other, even spin
zero (see Chapter XI). The study of this particular spin is therefore of
special importance.

For dealing with an angular momentum $\mathbf{m}$ whose magnitude is
$\tfrac{1}{2}\hbar$, it is convenient to put

$$\mathbf{m} = \tfrac{1}{2}\hbar\,\boldsymbol{\sigma}. \tag{48}$$

The components of the vector $\boldsymbol{\sigma}$ then satisfy, from (27),

$$\begin{aligned}
\sigma_y\sigma_z - \sigma_z\sigma_y &= 2i\sigma_x, \\
\sigma_z\sigma_x - \sigma_x\sigma_z &= 2i\sigma_y, \\
\sigma_x\sigma_y - \sigma_y\sigma_x &= 2i\sigma_z.
\end{aligned} \tag{49}$$

The eigenvalues of $m_z$ are $\tfrac{1}{2}\hbar$ and $-\tfrac{1}{2}\hbar$,
so the eigenvalues of $\sigma_z$ are 1 and $-1$, and $\sigma_z^2$ has just
the one eigenvalue 1. It follows that $\sigma_z^2$ must equal 1, and
similarly for $\sigma_x^2$ and $\sigma_y^2$, i.e.

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1. \tag{50}$$

We can get equations (49) and (50) into a simpler form by means of some
straightforward non-commutative algebra. From (50)

$$\sigma_y^2\sigma_z - \sigma_z\sigma_y^2 = 0$$

or

$$\sigma_y(\sigma_y\sigma_z - \sigma_z\sigma_y) + (\sigma_y\sigma_z - \sigma_z\sigma_y)\sigma_y = 0$$

or

$$\sigma_y\sigma_x + \sigma_x\sigma_y = 0$$

with the help of the first of equations (49). This means
$\sigma_x\sigma_y = -\sigma_y\sigma_x$. Two dynamical variables or linear
operators like these which satisfy the commutative law of multiplication
except for a minus sign will be said to anticommute. Thus $\sigma_x$
anticommutes with $\sigma_y$. From symmetry each of the three dynamical
variables $\sigma_x$, $\sigma_y$, $\sigma_z$ must anticommute with any
other. Equations (49) may now be written

$$\begin{aligned}
\sigma_y\sigma_z &= i\sigma_x = -\sigma_z\sigma_y, \\
\sigma_z\sigma_x &= i\sigma_y = -\sigma_x\sigma_z, \\
\sigma_x\sigma_y &= i\sigma_z = -\sigma_y\sigma_x,
\end{aligned} \tag{51}$$

and also from (50)

$$\sigma_x\sigma_y\sigma_z = i. \tag{52}$$

Equations (50), (51), (52) are the fundamental equations satisfied by the
spin variables $\boldsymbol{\sigma}$ describing a spin whose magnitude is
$\tfrac{1}{2}\hbar$.

Let us set up a matrix representation for the $\sigma$'s and let us take
$\sigma_z$ to be diagonal. If there are no other independent dynamical
variables besides the $m$'s or $\sigma$'s in our dynamical system, then
$\sigma_z$ by itself forms a complete set of commuting observables, since
the form of equations (50) and (51) is such that we cannot construct out
of $\sigma_x$, $\sigma_y$, and $\sigma_z$ any new dynamical variable that
commutes with $\sigma_z$. The diagonal elements of the matrix
representing $\sigma_z$ being the eigenvalues 1 and $-1$ of $\sigma_z$,
the matrix itself will be

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Let $\sigma_x$ be represented by

$$\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}.$$

This matrix must be Hermitian, so that $a_1$ and $a_4$ must be real and
$a_2$ and $a_3$ conjugate complex numbers. The equation
$\sigma_z\sigma_x = -\sigma_x\sigma_z$ gives us

$$\begin{pmatrix} a_1 & a_2 \\ -a_3 & -a_4 \end{pmatrix} = -\begin{pmatrix} a_1 & -a_2 \\ a_3 & -a_4 \end{pmatrix},$$

so that $a_1 = a_4 = 0$. Hence $\sigma_x$ is represented by a matrix of
the form

$$\begin{pmatrix} 0 & a_2 \\ a_3 & 0 \end{pmatrix}.$$

The equation $\sigma_x^2 = 1$ now shows that $a_2 a_3 = 1$. Thus $a_2$ and
$a_3$, being conjugate complex numbers, must be of the form $e^{i\alpha}$
and $e^{-i\alpha}$ respectively, where $\alpha$ is a real number, so that
$\sigma_x$ is represented by a matrix of the form

$$\begin{pmatrix} 0 & e^{i\alpha} \\ e^{-i\alpha} & 0 \end{pmatrix}.$$

Similarly it may be shown that $\sigma_y$ is also represented by a matrix
of this form. By suitably choosing the phase factors in the
representation, which is not completely determined by the condition that
$\sigma_z$ shall be diagonal, we can arrange that $\sigma_x$ shall be
represented by the matrix

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

The representative of $\sigma_y$ is then determined by the equation
$\sigma_y = i\sigma_x\sigma_z$. We thus obtain finally the three matrices

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{53}$$

to represent $\sigma_x$, $\sigma_y$, and $\sigma_z$ respectively, which
matrices satisfy all the algebraic relations (49), (50), (51), (52). The
component of the vector $\boldsymbol{\sigma}$ in an arbitrary direction
specified by the direction cosines $l$, $m$, $n$, namely
$l\sigma_x + m\sigma_y + n\sigma_z$, is represented by

$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix}. \tag{54}$$
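
That the three matrices satisfy the algebraic relations (49) to (52) can be confirmed by direct multiplication. The following small sketch is offered as a check rather than as part of the text; the names `SX`, `SY`, `SZ`, `ONE` and the helper functions are illustrative, and matrices are stored as nested tuples so that exact equality of small integers and imaginary units can be used.

```python
# The 2x2 matrices representing sigma_x, sigma_y, sigma_z, with sigma_z diagonal.
SX = ((0, 1), (1, 0))
SY = ((0, -1j), (1j, 0))
SZ = ((1, 0), (0, -1))
ONE = ((1, 0), (0, 1))


def mul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def scale(c, a):
    """Matrix a multiplied by the number c."""
    return tuple(tuple(c * x for x in row) for row in a)


def sub(a, b):
    return tuple(tuple(x - y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))


def check_spin_algebra():
    """Verify (49), (50), (51), (52) for the three matrices; return True if all hold."""
    cyclic = [(SX, SY, SZ), (SY, SZ, SX), (SZ, SX, SY)]
    for a, b, c in cyclic:
        assert sub(mul(a, b), mul(b, a)) == scale(2j, c)   # (49)
        assert mul(a, a) == ONE                            # (50)
        assert mul(a, b) == scale(1j, c)                   # (51)
        assert mul(a, b) == scale(-1, mul(b, a))           # anticommutation
    assert mul(mul(SX, SY), SZ) == scale(1j, ONE)          # (52)
    return True
```

Calling `check_spin_algebra()` runs through the cyclic permutations of the suffixes and raises an error if any relation fails.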


The representative of a ket vector will consist of just two numbers,
corresponding to the two values $+1$ and $-1$ for $\sigma_z'$. These two
numbers form a function of the variable $\sigma_z'$ whose domain consists
of only the two points $+1$ and $-1$. The state for which $\sigma_z$ has
the value unity will be represented by the function,
$f_\alpha(\sigma_z')$ say, consisting of the pair of numbers 1, 0 and
that for which $\sigma_z$ has the value $-1$ will be represented by the
function, $f_\beta(\sigma_z')$ say, consisting of the pair 0, 1. Any
function of the variable $\sigma_z'$, i.e. any pair of numbers, can be
expressed as a linear combination of these two. Thus any state can be
obtained by superposition of the two states for which $\sigma_z$ equals
$+1$ and $-1$ respectively. For example, the state for which the
component of $\boldsymbol{\sigma}$ in the direction $l$, $m$, $n$,
represented by (54), has the value $+1$ is represented by the pair of
numbers $a$, $b$ which satisfy

$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$$

or

$$na + (l - im)b = a, \qquad (l + im)a - nb = b.$$

Thus

$$\frac{a}{b} = \frac{l - im}{1 - n} = \frac{1 + n}{l + im}.$$

This state can be regarded as a superposition of the two states for which
$\sigma_z$ equals $+1$ and $-1$, the relative weights in the
superposition process being as

$$|a|^2 : |b|^2 = |l - im|^2 : (1 - n)^2 = 1 + n : 1 - n. \tag{55}$$
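
As a numerical illustration of (55), the sketch below takes a set of direction cosines, forms the pair a, b from the ratio a/b = (1 + n)/(l + im), and applies the matrix (54) to it. The function names are hypothetical, the pair is left unnormalized, and the direction n = -1 is excluded because this particular choice of a and b then vanishes.

```python
from math import isclose


def sigma_component(l, m, n):
    """Matrix (54) representing l*sigma_x + m*sigma_y + n*sigma_z."""
    return ((n, complex(l, -m)), (complex(l, m), -n))


def spin_up_along(l, m, n):
    """Unnormalized pair (a, b) with a/b = (1 + n)/(l + i m).

    Assumes l^2 + m^2 + n^2 = 1 and n != -1.
    """
    return complex(1 + n), complex(l, m)


def apply(matrix, pair):
    """Apply a 2x2 matrix to a pair of numbers."""
    a, b = pair
    return (matrix[0][0] * a + matrix[0][1] * b,
            matrix[1][0] * a + matrix[1][1] * b)


def relative_weights(l, m, n):
    """Return (|a|^2, |b|^2), which should stand in the ratio (1 + n) : (1 - n)."""
    a, b = spin_up_along(l, m, n)
    return abs(a) ** 2, abs(b) ** 2
```

With l = 2/3, m = 1/3, n = 2/3 the pair is reproduced unchanged by the matrix, and the weights come out in the ratio 5 : 1, equal to (1 + n) : (1 - n).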


For the complete description of an electron (or other elementary particle
with spin $\tfrac{1}{2}\hbar$) we require the spin dynamical variables
$\boldsymbol{\sigma}$, whose connexion with the spin angular momentum is
given by (48), together with the Cartesian coordinates $x$, $y$, $z$ and
momenta $p_x$, $p_y$, $p_z$. The spin dynamical variables commute with
these coordinates and momenta. Thus a complete set of commuting
observables for a system consisting of a single electron will be $x$,
$y$, $z$, $\sigma_z$. In a representation in which these are diagonal,
the representative of any state will be a function of four variables
$x'$, $y'$, $z'$, $\sigma_z'$. Since $\sigma_z'$ has a domain consisting
of only two points, namely 1 and $-1$, this function of four variables is
the same as two functions of three variables, namely the two functions

$$\langle x'y'z'|\rangle_+ = \langle x', y', z', +1|\rangle, \qquad \langle x'y'z'|\rangle_- = \langle x', y', z', -1|\rangle. \tag{56}$$



38. Motion in a central field of force 

An atom consists of a massive positively charged nucleus together with a
number of electrons moving round, under the influence of the attractive
force of the nucleus and their own mutual repulsions. An exact treatment
of this dynamical system is a very difficult mathematical problem. One
can, however, gain some insight into the main features of the system by
making the rough approximation of regarding each electron as moving
independently in a certain central field of force, namely that of the
nucleus, assumed fixed, together with some kind of average of the forces
due to the other electrons. Thus our present problem of the motion of a
particle in a central field of force forms a corner-stone in the theory
of the atom.

Let the Cartesian coordinates of the particle, referred to a system of
axes with the centre of force as origin, be $x$, $y$, $z$ and the
corresponding components of momentum $p_x$, $p_y$, $p_z$. The
Hamiltonian, with neglect of relativistic mechanics, will be of the form

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V, \tag{57}$$

where $V$, the potential energy, is a function only of
$(x^2 + y^2 + z^2)$. To develop the theory it is convenient to introduce
polar dynamical variables. We introduce first the radius $r$, defined as
the positive square root

$$r = (x^2 + y^2 + z^2)^{1/2}.$$

Its eigenvalues go from 0 to $\infty$. If we evaluate its P.B.s with
$p_x$, $p_y$, and $p_z$, we obtain, with the help of formula (32) of § 22,

$$[r, p_x] = \frac{\partial r}{\partial x} = \frac{x}{r}, \qquad [r, p_y] = \frac{\partial r}{\partial y} = \frac{y}{r}, \qquad [r, p_z] = \frac{\partial r}{\partial z} = \frac{z}{r},$$

the same as in the classical theory. We introduce also the dynamical
variable $p_r$ defined by

$$p_r = r^{-1}(x p_x + y p_y + z p_z). \tag{58}$$

Its P.B. with $r$ is given by

$$r[r, p_r] = [r, r p_r] = [r, x p_x + y p_y + z p_z] = x \cdot x/r + y \cdot y/r + z \cdot z/r = r.$$

Hence

$$[r, p_r] = 1$$

or

$$r p_r - p_r r = i\hbar.$$

The commutation relation between $r$ and $p_r$ is just the one for a
canonical coordinate and momentum, namely equation (10) of § 22. This
makes $p_r$ like the momentum conjugate to the $r$ coordinate, but it is
not exactly equal to this momentum because it is not real, its conjugate
complex being

$$\bar p_r = (p_x x + p_y y + p_z z) r^{-1} = (x p_x + y p_y + z p_z - 3i\hbar) r^{-1} = (r p_r - 3i\hbar) r^{-1} = p_r - 2i\hbar r^{-1}. \tag{59}$$

Thus $p_r - i\hbar r^{-1}$ is real and is the true momentum conjugate to
$r$.

The angular momentum $\mathbf{m}$ of the particle about the origin is
given by (22) and its magnitude $k$ is given by (39). Since $r$ and $p_r$
are scalars, they commute with $\mathbf{m}$, and therefore also with $k$.

We can express the Hamiltonian in terms of $r$, $p_r$, and $k$. We have,
if $\sum_{xyz}$ denotes a sum over cyclic permutations of the suffixes
$x$, $y$, $z$,

$$\begin{aligned}
k(k + \hbar) &= \sum_{xyz} m_z^2 = \sum_{xyz} (x p_y - y p_x)^2 \\
&= \sum_{xyz} (x p_y\, x p_y + y p_x\, y p_x - x p_y\, y p_x - y p_x\, x p_y) \\
&= \sum_{xyz} (x^2 p_y^2 + y^2 p_x^2 - x p_x p_y y - y p_y p_x x + x^2 p_x^2 - x p_x p_x x - 2i\hbar\, x p_x) \\
&= (x^2 + y^2 + z^2)(p_x^2 + p_y^2 + p_z^2) - (x p_x + y p_y + z p_z)(p_x x + p_y y + p_z z + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - r p_r(\bar p_r r + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - r p_r^2 r
\end{aligned}$$

from (59). Hence

$$H = \frac{1}{2m}\left(\frac{1}{r} p_r^2 r + \frac{k(k + \hbar)}{r^2}\right) + V. \tag{60}$$

This form for $H$ is such that $k$ commutes not only with $H$, as is
necessary since $k$ is a constant of the motion, but also with every
dynamical variable occurring in $H$, namely $r$, $p_r$, and $V$, which is
a function of $r$. In consequence, a simple treatment becomes possible,
namely, we may consider an eigenstate of $k$ belonging to an eigenvalue
$k'$ and then we can substitute $k'$ for $k$ in (60) and get a problem in
one degree of freedom $r$.

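Let us introduce Schrödinger's representation with $x$, $y$, $z$
diagonal. Then $p_x$, $p_y$, $p_z$ are equal to the operators
$-i\hbar\,\partial/\partial x$, $-i\hbar\,\partial/\partial y$,
$-i\hbar\,\partial/\partial z$ respectively. A state is represented by a
wave function $\psi(xyzt)$ satisfying Schrödinger's wave equation (7) of
§ 27, which now reads, with $H$ given by (57),

$$i\hbar\frac{\partial\psi}{\partial t} = \left\{-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + V\right\}\psi. \tag{61}$$

We may pass from the Cartesian coordinates $x$, $y$, $z$ to the polar
coordinates $r$, $\theta$, $\phi$ by means of the equations

$$\begin{aligned}
x &= r\sin\theta\cos\phi, \\
y &= r\sin\theta\sin\phi, \\
z &= r\cos\theta,
\end{aligned} \tag{62}$$

and may express the wave function in terms of the polar coordinates, so
that it reads $\psi(r\theta\phi t)$. The equations (62) give the operator
equation

$$\frac{\partial}{\partial r} = \frac{\partial x}{\partial r}\frac{\partial}{\partial x} + \frac{\partial y}{\partial r}\frac{\partial}{\partial y} + \frac{\partial z}{\partial r}\frac{\partial}{\partial z} = \frac{x}{r}\frac{\partial}{\partial x} + \frac{y}{r}\frac{\partial}{\partial y} + \frac{z}{r}\frac{\partial}{\partial z},$$

which shows, on being compared with (58), that
$p_r = -i\hbar\,\partial/\partial r$. Thus Schrödinger's wave equation
reads, with the form (60) for $H$,

$$i\hbar\frac{\partial\psi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{1}{r}\frac{\partial^2}{\partial r^2}r + \frac{k(k + \hbar)}{\hbar^2 r^2}\right) + V\right\}\psi. \tag{63}$$

Here $k$ is a certain linear operator which, since it commutes with $r$
and $\partial/\partial r$, can involve only $\theta$, $\phi$,
$\partial/\partial\theta$, and $\partial/\partial\phi$. From the formula

$$k(k + \hbar) = m_x^2 + m_y^2 + m_z^2, \tag{64}$$

which comes from (39), and from (62) one can work out the form of
$k(k + \hbar)$ and one finds

$$\frac{k(k + \hbar)}{\hbar^2} = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{65}$$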

This operator is well known in mathematical physics. Its eigenfunctions
are called spherical harmonics and its eigenvalues are $n(n + 1)$ where
$n$ is an integer. Thus the theory of spherical harmonics provides an
alternative proof that the eigenvalues of $k$ are integral multiples of
$\hbar$.

For an eigenstate of $k$ belonging to the eigenvalue $n\hbar$ ($n$ a
non-negative integer) the wave function will be of the form

$$\psi = r^{-1}\chi(rt)\,S_n(\theta\phi), \tag{66}$$

where $S_n(\theta\phi)$ satisfies

$$k(k + \hbar)S_n(\theta\phi) = n(n + 1)\hbar^2 S_n(\theta\phi), \tag{67}$$

i.e. from (65) $S_n$ is a spherical harmonic of order $n$. The factor
$r^{-1}$ is inserted in (66) for convenience. Substituting (66) into
(63), we get as the equation for $\chi$

$$i\hbar\frac{\partial\chi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{\partial^2}{\partial r^2} + \frac{n(n + 1)}{r^2}\right) + V\right\}\chi. \tag{68}$$

If the state is a stationary state belonging to the energy value $H'$,
$\chi$ will be of the form

$$\chi(rt) = \chi_0(r)\,e^{-iH't/\hbar}$$

and (68) will reduce to

$$H'\chi_0 = \left\{\frac{\hbar^2}{2m}\left(-\frac{d^2}{dr^2} + \frac{n(n + 1)}{r^2}\right) + V\right\}\chi_0. \tag{69}$$

This equation may be used to determine the energy-levels $H'$ of the
system. For each solution $\chi_0$ of (69), arising from a given $n$,
there will be $2n + 1$ independent states, because there are $2n + 1$
independent solutions of (67) corresponding to the $2n + 1$ different
values that a component of the angular momentum, $m_z$ say, can take on.

The probability of the particle being in an element of volume $dx\,dy\,dz$
is proportional to $|\psi|^2\,dx\,dy\,dz$. With $\psi$ of the form (66)
this becomes $r^{-2}|\chi|^2|S_n|^2\,dx\,dy\,dz$. The probability of the
particle being in a spherical shell between $r$ and $r + dr$ is then
proportional to $|\chi|^2\,dr$. It now becomes clear that, in solving
equation (68) or (69), we must impose a boundary condition on the
function $\chi$ at $r = 0$, namely the function must be such that the
integral to the origin $\int_0 |\chi|^2\,dr$ is convergent. If this
integral were not convergent, the wave function would represent a state
for which the chances are infinitely in favour of the particle being at
the origin and such a state would not be physically admissible.

The boundary condition at $r = 0$ obtained by the above consideration of
probabilities is, however, not sufficiently stringent. We get a more
stringent condition by verifying that the wave function obtained by
solving the wave equation in polar coordinates (63) really satisfies the
wave equation in Cartesian coordinates (61). Let us take the case of
$V = 0$, giving us the problem of the free particle. Applied to a
stationary state with energy $H' = 0$, equation (61) gives

$$\nabla^2\psi = 0, \tag{70}$$

where $\nabla^2$ is written for the Laplacian operator
$\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$,
and equation (63) gives

$$\left(\frac{1}{r}\frac{\partial^2}{\partial r^2}r - \frac{k(k + \hbar)}{\hbar^2 r^2}\right)\psi = 0. \tag{71}$$

A solution of (71) for $k = 0$ is $\psi = r^{-1}$. This does not satisfy
(70), since, although $\nabla^2 r^{-1}$ vanishes for any finite value of
$r$, its integral through a volume containing the origin is $-4\pi$ (as
may be verified by transforming this volume integral to a surface
integral by means of Gauss's theorem), and hence

$$\nabla^2 r^{-1} = -4\pi\,\delta(x)\,\delta(y)\,\delta(z). \tag{72}$$

Thus not every solution of (71) gives a solution of (70), and more
generally, not every solution of (63) is a solution of (61). We must
impose on the solution of (63) the condition that it shall not tend to
infinity as rapidly as $r^{-1}$ when $r \to 0$ in order that, when
substituted into (61), it shall not give a $\delta$ function on the right
like the right-hand side of (72). Only when equation (63) is supplemented
with this condition does it become equivalent to equation (61). We thus
have the boundary condition $r\psi \to 0$ or $\chi \to 0$ as $r \to 0$.

There are also boundary conditions for the wave function at $r = \infty$.
If we are interested only in 'closed' states, i.e. states for which the
particle does not go off to infinity, we must restrict the integral to
infinity $\int^\infty |\chi(r)|^2\,dr$ to be convergent. These closed
states, however, are not the only ones that are physically permissible,
as we can also have states in which the particle arrives from infinity,
is scattered by the central field of force, and goes off to infinity
again. For these states the wave function may remain finite as
$r \to \infty$. Such states will be dealt with in Chapter VIII under the
heading of collision problems. In any case the wave function must not
tend to infinity as $r \to \infty$, or it will represent a state that has
no physical meaning.


39. Energy-levels of the hydrogen atom 
The above analysis may be applied to the problem of the hydrogen atom
with neglect of relativistic mechanics and the spin of the electron. The
potential energy $V$ is now† $-e^2/r$, so that equation (69) becomes

$$\left\{\frac{d^2}{dr^2} - \frac{n(n + 1)}{r^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}\chi_0 = -\frac{2mH'}{\hbar^2}\chi_0. \tag{73}$$

A thorough investigation of this equation has been given by
Schrödinger.‡ We shall here obtain its eigenvalues $H'$ by an elementary
argument.

It is convenient to put

$$\chi_0 = f(r)\,e^{-r/a}, \tag{74}$$

introducing the new function $f(r)$, where $a$ is one or other of the
square roots

$$a = \pm\sqrt{-\hbar^2/2mH'}. \tag{75}$$

Equation (73) now becomes

$$\left\{\frac{d^2}{dr^2} - \frac{2}{a}\frac{d}{dr} - \frac{n(n + 1)}{r^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}f(r) = 0. \tag{76}$$

We look for a solution of this equation in the form of a power series

$$f(r) = \sum_s c_s r^s, \tag{77}$$

in which consecutive values for $s$ differ by unity although these values
themselves need not be integers. On substituting (77) in (76) we obtain

$$\sum_s c_s\{s(s - 1)r^{s-2} - (2s/a)r^{s-1} - n(n + 1)r^{s-2} + (2me^2/\hbar^2)r^{s-1}\} = 0,$$

which gives, on equating to zero the coefficient of $r^{s-2}$, the
following relation between successive coefficients $c_s$,

$$c_s[s(s - 1) - n(n + 1)] = c_{s-1}[2(s - 1)/a - 2me^2/\hbar^2]. \tag{78}$$

We saw in the preceding section that only those eigenfunctions $\chi$ are
allowed that tend to zero with $r$ and hence, from (74), $f(r)$ must tend
to zero with $r$. The series (77) must therefore terminate on the side of
small $s$ and the minimum value of $s$ must be greater than zero. Now the
only possible minimum values of $s$ are those that make the coefficient
of $c_s$ in (78) vanish, i.e. $n + 1$ and $-n$, and the second of these
is negative or zero. Thus the minimum value of $s$ must be $n + 1$. Since
$n$ is always an integer, the values of $s$ will all be integers.

† The $e$ here, denoting minus the charge on an electron, is, of course,
to be distinguished from the $e$ denoting the base of exponentials.

‡ Schrödinger, Ann. d. Physik, 79 (1926), 361.

The series (77) will in general extend to infinity on the side of large
$s$. For large values of $s$ the ratio of successive terms is

$$\frac{c_s r}{c_{s-1}} = \frac{2r}{sa}$$

according to (78). Thus the series (77) will always converge, as the
ratios of the higher terms to one another are the same as for the series

$$\sum_s \frac{1}{s!}\left(\frac{2r}{a}\right)^s, \tag{79}$$

which converges to $e^{2r/a}$.

We must now examine how our solution $\chi_0$ behaves for large values of
$r$. We must distinguish between the two cases of $H'$ positive and $H'$
negative. For $H'$ negative, $a$ given by (75) will be real. Suppose we
take the positive value for $a$. Then as $r \to \infty$ the sum of the
series (77) will tend to infinity according to the same law as the sum of
the series (79), i.e. the law $e^{2r/a}$. Thus, from (74), $\chi_0$ will
tend to infinity according to the law $e^{r/a}$ and will not represent a
physically possible state. There is therefore in general no permissible
solution of (73) for negative values of $H'$. An exception arises,
however, whenever the series (77) terminates on the side of large $s$, in
which case the boundary conditions are all satisfied. The condition for
this termination of the series is that the coefficient of $c_{s-1}$ in
(78) shall vanish for some value of the suffix $s - 1$ not less than its
minimum value $n + 1$, which is the same as the condition that

$$\frac{s}{a} - \frac{me^2}{\hbar^2} = 0$$

for some integer $s$ not less than $n + 1$. With the help of (75) this
condition becomes

$$H' = -\frac{me^4}{2s^2\hbar^2}, \tag{80}$$

and is thus a condition for the energy-level $H'$. Since $s$ may be any
positive integer, the formula (80) gives a discrete set of negative
energy-levels for the hydrogen atom. These are in agreement with
experiment. For each of them (except the lowest one $s = 1$) there are
several independent states, as there are various possible values for $n$,
namely any positive or zero integer less than $s$. This multiplicity of
states belonging to an energy-level is in addition to that mentioned in
the preceding section arising from the various possible values for a
component of angular momentum, which latter multiplicity occurs with any
central field of force. The $n$ multiplicity occurs only with an inverse
square law of force and even then is removed when one takes relativistic
mechanics into account, as will be found in Chapter XI. The solution
$\chi_0$ of (73) when $H'$ satisfies (80) tends to zero exponentially as
$r \to \infty$ and thus represents a closed state (corresponding to an
elliptic orbit in Bohr's theory).
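
The termination of the series and the resulting energy-levels can be followed through in a short calculation. The sketch below is an illustration only and rests on an assumption not made in the text: units are chosen so that hbar = m = e = 1, in which case the termination condition gives a = s and (80) reads H' = -1/(2 s^2). The function names are hypothetical. The coefficients are generated from the recurrence (78) starting with c_{n+1} = 1, and exact fractions are used so that the vanishing of the later coefficients is exact.

```python
from fractions import Fraction


def energy_level(s):
    """Energy-level (80) in units with hbar = m = e = 1 (an assumption of this sketch)."""
    return Fraction(-1, 2 * s * s)


def series_coefficients(n, s_max, extra=3):
    """Coefficients c_s of (77) from the recurrence (78), starting with c_{n+1} = 1.

    a is set to s_max, the value that makes the bracket on the right of (78)
    vanish for the suffix s_max, so every c_s with s > s_max comes out zero.
    Requires s_max >= n + 1.
    """
    if s_max < n + 1:
        raise ValueError("s_max must not be less than n + 1")
    a = Fraction(s_max)
    coeffs = {n + 1: Fraction(1)}
    for s in range(n + 2, s_max + extra + 1):
        left = s * (s - 1) - n * (n + 1)   # coefficient of c_s in (78)
        right = 2 * (s - 1) / a - 2        # coefficient of c_{s-1} in (78)
        coeffs[s] = coeffs[s - 1] * right / left
    return coeffs


def number_of_states(s):
    """Independent states for level s: n = 0, ..., s-1, each with 2n + 1 states."""
    return sum(2 * n + 1 for n in range(s))
```

For n = 1 and s = 3 the coefficients are c_2 = 1, c_3 = -1/6, and zero thereafter, so f(r) is a polynomial and the solution is a closed state. Counting both multiplicities gives s^2 independent states for the level s, the spin being neglected as in the text.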

For any positive values of $H'$, $a$ given by (75) will be pure imaginary.
The series (77), which is like the series (79) for large $r$, will now have a
sum that remains finite as $r \to \infty$. Thus $\chi_0$ given by (74) will now
remain finite as $r \to \infty$ and will therefore be a permissible solution
of (73), giving a wave function $\psi$ that tends to zero according to the law
$r^{-1}$ as $r \to \infty$. Hence in addition to the discrete set of negative
energy-levels (80), all positive energy-levels are allowed. The states of
positive energy are not closed, since for them the integral to infinity
$\int^{\infty} |\chi_0|^2\,dr$ does not converge. (These states correspond to
the hyperbolic orbits of Bohr's theory.)
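The counting of states belonging to the levels (80) can be sketched in a few lines. The sketch below assumes units in which $m = e = \hbar = 1$, so that $H' = -1/(2s^2)$, and it assumes the usual count of $2n+1$ values of $m_z$ for each $n$ (spin is left out); the function names are illustrative only.

```python
# Energy-levels (80) of the hydrogen atom, in assumed units m = e = hbar = 1.

def energy_level(s: int) -> float:
    """Return H' = -m e^4 / (2 s^2 hbar^2) with m = e = hbar = 1."""
    if s < 1:
        raise ValueError("s must be a positive integer")
    return -1.0 / (2.0 * s * s)

def n_values(s: int) -> list:
    """Allowed values of n for a given s: any integer with 0 <= n < s."""
    return list(range(s))

def multiplicity(s: int) -> int:
    """Independent orbital states for level s, counting 2n + 1 values of m_z per n."""
    return sum(2 * n + 1 for n in n_values(s))

for s in range(1, 5):
    print(s, energy_level(s), n_values(s), multiplicity(s))
```

Only the lowest level $s = 1$ has a single value of $n$, in line with the statement above that every other level has several independent states.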

40. Selection rules 

If a dynamical system is set up in a certain stationary state, it will 
remain in that stationary state so long as it is not acted upon by 
outside forces. Any atomic system in practice, however, frequently
gets acted upon by external electromagnetic fields, under whose 
influence it is liable to cease to be in one stationary state and to make 
a transition to another. The theory of such transitions will be de- 
veloped in §§ 44 and 45. A result of this theory is that, to a high degree 
of accuracy, transitions between two states cannot occur under the 
influence of electromagnetic radiation if, in a Heisenberg representa- 
tion with these two stationary states as two of the basic states, the 
matrix element, referring to these two states, of the representative 
of the total electric displacement $\mathbf{D}$ of the system vanishes. Now it
happens for many atomic systems that the great majority of the
matrix elements of $\mathbf{D}$ in a Heisenberg representation do vanish, and
hence there are severe limitations on the possibilities for transitions.
The rules that express these limitations are called selection rules.

The idea of selection rules can be refined by a more detailed
application of the theory of §§ 44 and 45, according to which
the matrix elements of the different Cartesian components of the
vector $\mathbf{D}$ are associated with different states of polarization of the
electromagnetic radiation. The nature of this association is just what
one would get if one considered the matrix elements, or rather their 
real parts, as the amplitudes of harmonic oscillators which interact 
with the field of radiation according to classical electrodynamics. 

There is a general method for obtaining all selection rules, as
follows. Let us call the constants of the motion which are diagonal in
the Heisenberg representation $\alpha$'s and let $D$ be one of the Cartesian
components of $\mathbf{D}$. We must obtain an algebraic equation connecting
$D$ and the $\alpha$'s which does not involve any dynamical variables other
than $D$ and the $\alpha$'s and which is linear in $D$. Such an equation will
be of the form

$$\sum_r f_r D g_r = 0, \tag{81}$$

where the $f_r$'s and $g_r$'s are functions of the $\alpha$'s only. If this equation
is expressed in terms of representatives, it gives us

$$\sum_r f_r(\alpha')\langle\alpha'|D|\alpha''\rangle g_r(\alpha'') = 0,$$

or

$$\langle\alpha'|D|\alpha''\rangle \sum_r f_r(\alpha') g_r(\alpha'') = 0,$$

which shows that $\langle\alpha'|D|\alpha''\rangle = 0$ unless

$$\sum_r f_r(\alpha') g_r(\alpha'') = 0. \tag{82}$$

This last equation, giving the connexion which must exist between
$\alpha'$ and $\alpha''$ in order that $\langle\alpha'|D|\alpha''\rangle$ may not vanish, constitutes the
selection rule, so far as the component $D$ of $\mathbf{D}$ is concerned.

Our work on the harmonic oscillator in § 34 provides an example
of a selection rule. Equation (8) is of the form (81) with $\bar\eta$ for $D$ and
$H$ playing the part of the $\alpha$'s, and it shows that the matrix elements
$\langle H'|\bar\eta|H''\rangle$ of $\bar\eta$ all vanish except those for which
$H''-H' = \hbar\omega$. The conjugate complex of this result is that the matrix
elements $\langle H'|\eta|H''\rangle$ of $\eta$ all vanish except those for which
$H''-H' = -\hbar\omega$. Since $q$ is a numerical multiple of $\eta-\bar\eta$, its
matrix elements $\langle H'|q|H''\rangle$ all vanish except those for which
$H''-H' = \pm\hbar\omega$. If the harmonic oscillator carries an electric charge,
its electric displacement $D$ will be proportional to $q$. The selection rule
is then that only those transitions can take place in which the energy $H$
changes by a single quantum $\hbar\omega$.
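The oscillator rule can be illustrated with a small numerical sketch. It assumes a hypothetical truncation to the six lowest levels, energies $(n+\tfrac12)\hbar\omega$ with $\hbar\omega = 1$, and a matrix for $\bar\eta$ whose only non-vanishing elements connect each level to the one a quantum above it; the numerical factor relating $q$ to $\eta-\bar\eta$ is dropped. None of these names or sizes come from the text.

```python
import math

HBAR_OMEGA = 1.0   # size of one quantum, arbitrary units (assumed)
N = 6              # number of levels kept in the truncation (assumed)

# Energy-levels of the oscillator, H' = (n + 1/2) hbar omega.
energies = [(n + 0.5) * HBAR_OMEGA for n in range(N)]

# Hypothetical truncated matrix for eta-bar, rows labelled by H' and
# columns by H''. It connects each level to the one a quantum above it.
eta_bar = [[math.sqrt(col) if row == col - 1 else 0.0 for col in range(N)]
           for row in range(N)]

# eta is the conjugate (here simply the transpose) of eta-bar, and q is a
# numerical multiple of eta - eta_bar; the multiple is dropped.
q = [[eta_bar[col][row] - eta_bar[row][col] for col in range(N)]
     for row in range(N)]

def equation_8_holds(matrix):
    """Check eta_bar H - H eta_bar = hbar omega eta_bar element by element."""
    return all(
        math.isclose(matrix[r][c] * energies[c] - energies[r] * matrix[r][c],
                     HBAR_OMEGA * matrix[r][c], abs_tol=1e-12)
        for r in range(N) for c in range(N))

def allowed_changes(matrix):
    """Values of H'' - H' for which the matrix element does not vanish."""
    return {energies[c] - energies[r]
            for r in range(N) for c in range(N) if matrix[r][c] != 0.0}

print(equation_8_holds(eta_bar))
print(sorted(allowed_changes(eta_bar)))
print(sorted(allowed_changes(q)))
```

The set returned for `q` contains only plus and minus one quantum, which is the selection rule stated above.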

We shall now obtain the selection rules for $m_z$ and $k$ for an electron
moving in a central field of force. The components of electric
displacement are here proportional to the Cartesian coordinates $x$, $y$, $z$.
Taking first $m_z$, we have that $m_z$ commutes with $z$, or that

$$m_z z - z m_z = 0.$$

This is an equation of the required type (81), namely
$\sum_r f_r D g_r = 0$ with $D = z$, $f_1 = m_z$, $g_1 = 1$, $f_2 = -1$,
$g_2 = m_z$, giving us the selection rule

$$m_z' - m_z'' = 0$$

for the $z$-component of the displacement. Again, from equations
(23) we have

$$[m_z,[m_z,x]] = [m_z,y] = -x$$

or

$$m_z^2 x - 2m_z x m_z + x m_z^2 - \hbar^2 x = 0,$$

which is also of the type (81) and gives us the selection rule

$${m_z'}^2 - 2m_z' m_z'' + {m_z''}^2 - \hbar^2 = 0$$

or

$$(m_z' - m_z'' - \hbar)(m_z' - m_z'' + \hbar) = 0$$

for the $x$-component of the displacement. The selection rule for the
$y$-component is the same. Thus our selection rules for $m_z$ are that, in
transitions associated with radiation with a polarization corresponding
to an electric dipole in the $z$-direction, $m_z$ cannot change, while in
transitions associated with a polarization corresponding to an electric
dipole in the $x$-direction or $y$-direction, $m_z$ must change by $\pm\hbar$.

We can determine more accurately the state of polarization of the
radiation associated with a transition in which $m_z$ changes by $\pm\hbar$, by
considering the condition for the non-vanishing of matrix elements
of $x+iy$ and $x-iy$. We have

$$[m_z, x+iy] = y - ix = -i(x+iy)$$

or

$$m_z(x+iy) - (x+iy)(m_z+\hbar) = 0,$$

which is again of the type (81). It gives

$$m_z' - m_z'' - \hbar = 0$$

as the condition that $\langle m_z'|x+iy|m_z''\rangle$ shall not vanish. Similarly,

$$m_z' - m_z'' + \hbar = 0$$

is the condition that $\langle m_z'|x-iy|m_z''\rangle$ shall not vanish. Hence

$$\langle m_z'|x-iy|m_z'-\hbar\rangle = 0$$

or

$$\langle m_z'|x|m_z'-\hbar\rangle = i\langle m_z'|y|m_z'-\hbar\rangle = (a+ib)e^{i\omega t},$$

say, $a$, $b$, and $\omega$ being real. The conjugate complex of this is

$$\langle m_z'-\hbar|x|m_z'\rangle = -i\langle m_z'-\hbar|y|m_z'\rangle = (a-ib)e^{-i\omega t}.$$

Thus the vector
$\tfrac12\{\langle m_z'|\mathbf{D}|m_z'-\hbar\rangle + \langle m_z'-\hbar|\mathbf{D}|m_z'\rangle\}$,
which determines the state of polarization of the radiation associated
with transitions for which $m_z'' = m_z'-\hbar$, has the following three
components

$$\begin{aligned}
\tfrac12\{\langle m_z'|x|m_z'-\hbar\rangle + \langle m_z'-\hbar|x|m_z'\rangle\}
 &= \tfrac12\{(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\cos\omega t - b\sin\omega t, \\
\tfrac12\{\langle m_z'|y|m_z'-\hbar\rangle + \langle m_z'-\hbar|y|m_z'\rangle\}
 &= \tfrac12 i\{-(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\sin\omega t + b\cos\omega t, \\
\tfrac12\{\langle m_z'|z|m_z'-\hbar\rangle + \langle m_z'-\hbar|z|m_z'\rangle\} &= 0.
\end{aligned} \tag{83}$$

From the form of these components we see that the associated radiation
moving in the $z$-direction will be circularly polarized, that
moving in any direction in the $xy$-plane will be linearly polarized in
this plane, and that moving in intermediate directions will be
elliptically polarized. The direction of circular polarization for
radiation moving in the $z$-direction will depend on whether $\omega$ is positive
or negative, and this will depend on which of the two states $m_z'$ or
$m_z'' = m_z'-\hbar$ has the greater energy.
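A short numerical sketch may make the circular polarization concrete. It evaluates the $x$ and $y$ components $a\cos\omega t - b\sin\omega t$ and $a\sin\omega t + b\cos\omega t$ found above, with the $z$ component zero, and checks that the vector keeps a constant length while turning in a sense fixed by the sign of $\omega$. The values of $a$, $b$, and $\omega$ are arbitrary illustrative choices, and the function names are not from the text.

```python
import math

def polarization_vector(a, b, omega, t):
    """Components of the vector for the transition m_z'' = m_z' - hbar."""
    x = a * math.cos(omega * t) - b * math.sin(omega * t)
    y = a * math.sin(omega * t) + b * math.cos(omega * t)
    return (x, y, 0.0)

def length(a, b, omega, t):
    """Length of the vector in the xy-plane at time t."""
    x, y, _ = polarization_vector(a, b, omega, t)
    return math.hypot(x, y)

def rotation_sense(a, b, omega, t=0.3, dt=1e-3):
    """+1 for anticlockwise and -1 for clockwise rotation seen from +z."""
    x0, y0, _ = polarization_vector(a, b, omega, t)
    x1, y1, _ = polarization_vector(a, b, omega, t + dt)
    return math.copysign(1.0, x0 * y1 - y0 * x1)

a, b = 0.6, 0.8  # illustrative real amplitudes (assumed)
for omega in (2.0, -2.0):
    lengths = [length(a, b, omega, 0.1 * j) for j in range(50)]
    print(omega, min(lengths), max(lengths), rotation_sense(a, b, omega))
```

The length stays equal to $\sqrt{a^2+b^2}$ at all times, and reversing the sign of $\omega$ reverses the sense of rotation.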

We shall now determine the selection rule for $k$. We have

$$\begin{aligned}
[k(k+\hbar), z] &= [m_x^2, z] + [m_y^2, z] \\
 &= -y m_x - m_x y + x m_y + m_y x \\
 &= 2(m_y x - m_x y + i\hbar z) \\
 &= 2(m_y x - y m_x) = 2(x m_y - m_x y).
\end{aligned}$$

Similarly,

$$[k(k+\hbar), x] = 2(y m_z - m_y z)$$

and

$$[k(k+\hbar), y] = 2(m_x z - x m_z).$$

Hence

$$\begin{aligned}
[k(k+\hbar),[k(k+\hbar), z]]
 &= 2[k(k+\hbar),\, m_y x - m_x y + i\hbar z] \\
 &= 2m_y[k(k+\hbar), x] - 2m_x[k(k+\hbar), y] + 2i\hbar[k(k+\hbar), z] \\
 &= 4m_y(y m_z - m_y z) - 4m_x(m_x z - x m_z) + 2\{k(k+\hbar)z - zk(k+\hbar)\} \\
 &= 4(m_x x + m_y y + m_z z)m_z - 4(m_x^2 + m_y^2 + m_z^2)z + 2\{k(k+\hbar)z - zk(k+\hbar)\}.
\end{aligned}$$

From (22)

$$m_x x + m_y y + m_z z = 0 \tag{84}$$

and hence

$$[k(k+\hbar),[k(k+\hbar), z]] = -2\{k(k+\hbar)z + zk(k+\hbar)\},$$

which gives

$$k^2(k+\hbar)^2 z - 2k(k+\hbar)zk(k+\hbar) + zk^2(k+\hbar)^2 - 2\hbar^2\{k(k+\hbar)z + zk(k+\hbar)\} = 0. \tag{85}$$

Similar equations hold for $x$ and $y$. These equations are of the
required type (81), and give us the selection rule

$${k'}^2(k'+\hbar)^2 - 2k'(k'+\hbar)k''(k''+\hbar) + {k''}^2(k''+\hbar)^2 - 2\hbar^2 k'(k'+\hbar) - 2\hbar^2 k''(k''+\hbar) = 0,$$

which reduces to

$$(k'+k''+2\hbar)(k'+k'')(k'-k''+\hbar)(k'-k''-\hbar) = 0.$$

A transition can take place between two states $k'$ and $k''$ only if one
of these four factors vanishes.

Now the first of the factors, $(k'+k''+2\hbar)$, can never vanish, since
the eigenvalues of $k$ are all positive or zero. The second, $(k'+k'')$, can
vanish only if $k' = 0$ and $k'' = 0$. But transitions between two states
with these values for $k$ cannot occur on account of other selection
rules, as may be seen from the following argument. If two states
(labelled respectively with a single prime and a double prime) are
such that $k' = 0$ and $k'' = 0$, then from (41) and the corresponding
results for $m_x$ and $m_y$, $m_x' = m_y' = m_z' = 0$ and
$m_x'' = m_y'' = m_z'' = 0$. The selection rule for $m_z$ now shows that the
matrix elements of $x$ and $y$ referring to the two states must vanish, as
the value of $m_z$ does not change during the transition, and the similar
selection rule for $m_x$ shows that the matrix element of $z$ also vanishes.
Thus transitions between the two states cannot occur. Our selection rule
for $k$ now reduces to

$$(k'-k''+\hbar)(k'-k''-\hbar) = 0,$$

showing that $k$ must change by $\pm\hbar$. This selection rule may be written

$${k'}^2 - 2k'k'' + {k''}^2 - \hbar^2 = 0,$$

and since this is the condition that a matrix element $\langle k'|z|k''\rangle$ shall
not vanish, we get the equation

$$k^2 z - 2kzk + zk^2 - \hbar^2 z = 0$$

or

$$[k,[k,z]] = -z, \tag{86}$$

a result which could not easily be obtained in a more direct way.
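The reduction of the selection rule for $k$ to a product of four factors can be checked by direct arithmetic. The sketch below assumes $\hbar = 1$ and integer values of $k'$ and $k''$, compares the two forms of the condition, and lists the pairs for which a transition is not excluded; the pair $k' = k'' = 0$ is dropped by hand, following the argument above. The function names are illustrative.

```python
from itertools import product

def expanded_form(k1, k2, hbar=1):
    """Left-hand side of the selection rule before factorization."""
    A = k1 * (k1 + hbar)
    B = k2 * (k2 + hbar)
    return A * A - 2 * A * B + B * B - 2 * hbar**2 * A - 2 * hbar**2 * B

def factored_form(k1, k2, hbar=1):
    """Product of the four factors."""
    return ((k1 + k2 + 2 * hbar) * (k1 + k2)
            * (k1 - k2 + hbar) * (k1 - k2 - hbar))

# The two forms agree for every pair tried.
forms_agree = all(expanded_form(p, q) == factored_form(p, q)
                  for p, q in product(range(8), repeat=2))

# Pairs (k', k'') not excluded by the rule, leaving out k' = k'' = 0.
allowed_pairs = [(p, q) for p, q in product(range(4), repeat=2)
                 if factored_form(p, q) == 0 and (p, q) != (0, 0)]

print(forms_agree)
print(allowed_pairs)
```

Every surviving pair has $k'$ and $k''$ differing by exactly one unit.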

As a final example we shall obtain the selection rule for the magnitude
$K$ of the total angular momentum $\mathbf{M}$ of a general atomic system.
Let $x$, $y$, $z$ be the coordinates of one of the electrons. We must obtain
the condition that the $(K', K'')$ matrix element of $x$, $y$, or $z$ shall not
vanish. This is evidently the same as the condition that the $(K', K'')$
matrix element of $\lambda_1$, $\lambda_2$, or $\lambda_3$ shall not vanish, where
$\lambda_1$, $\lambda_2$, and $\lambda_3$ are any three independent linear functions of
$x$, $y$, and $z$ with numerical coefficients, or more generally with any
coefficients that commute with $K$ and are thus represented by matrices
which are diagonal with respect to $K$. Let

$$\begin{aligned}
\lambda_0 &= M_x x + M_y y + M_z z, \\
\lambda_x &= M_y z - M_z y - i\hbar x, \\
\lambda_y &= M_z x - M_x z - i\hbar y, \\
\lambda_z &= M_x y - M_y x - i\hbar z.
\end{aligned}$$

We have

$$\begin{aligned}
M_x\lambda_x + M_y\lambda_y + M_z\lambda_z
 &= \sum_{xyz}(M_x M_y z - M_x M_z y - i\hbar M_x x) \\
 &= \sum_{xyz}(M_x M_y - M_y M_x - i\hbar M_z)z = 0
\end{aligned} \tag{87}$$

from (29), the sums being taken over cyclic permutations of $x$, $y$, $z$.
Thus $\lambda_x$, $\lambda_y$, and $\lambda_z$ are not linearly independent functions
of $x$, $y$, and $z$. Any two of them, however, together with $\lambda_0$ are three
linearly independent functions of $x$, $y$, and $z$ and may be taken as the
above $\lambda_1$, $\lambda_2$, $\lambda_3$, since the coefficients $M_x$, $M_y$, $M_z$ all
commute with $K$. Our problem thus reduces to finding the condition that
the $(K', K'')$ matrix elements of $\lambda_0$, $\lambda_x$, $\lambda_y$, and $\lambda_z$
shall not vanish. The physical meanings of these $\lambda$'s are that $\lambda_0$ is
proportional to the component of the vector $(x, y, z)$ in the direction of
the vector $\mathbf{M}$, and $\lambda_x$, $\lambda_y$, $\lambda_z$ are proportional to the
Cartesian components of the component of $(x, y, z)$ perpendicular to
$\mathbf{M}$.

Since $\lambda_0$ is a scalar it must commute with $K$. It follows that only
the diagonal elements $\langle K'|\lambda_0|K'\rangle$ of $\lambda_0$ can differ from zero, so
the selection rule is that $K$ cannot change so far as $\lambda_0$ is concerned.
Applying (30) to the vector $\lambda_x$, $\lambda_y$, $\lambda_z$, we have

$$[M_z, \lambda_x] = \lambda_y, \qquad [M_z, \lambda_y] = -\lambda_x, \qquad [M_z, \lambda_z] = 0.$$

These relations between $M_z$ and $\lambda_x$, $\lambda_y$, $\lambda_z$ are of exactly the
same form as the relations (23), (24) between $m_z$ and $x$, $y$, $z$, and also
(87) is of the same form as (84). The dynamical variables $\lambda_x$,
$\lambda_y$, $\lambda_z$ thus have the same properties relative to the angular
momentum $\mathbf{M}$ as $x$, $y$, $z$ have relative to $\mathbf{m}$. The deduction of
the selection rule for $k$ when the electric displacement is proportional
to $(x, y, z)$ can therefore be taken over and applied to the selection rule
for $K$ when the electric displacement is proportional to
$(\lambda_x, \lambda_y, \lambda_z)$. We find in this way that, so far as $\lambda_x$,
$\lambda_y$, $\lambda_z$ are concerned, the selection rule for $K$ is that it must
change by $\pm\hbar$.

Collecting results, we have as the selection rule for $K$ that it must
change by $0$ or $\pm\hbar$. We have considered the electric displacement
produced by only one of the electrons, but the same selection rule
must hold for each electron and thus also for the total electric
displacement.


41. The Zeeman effect for the hydrogen atom 
We shall now consider the system of a hydrogen atom in a uniform
magnetic field. The Hamiltonian (57) with $V = -e^2/r$, which describes
the hydrogen atom in no external field, gets modified by the magnetic
field, the modification, according to classical mechanics, consisting
in the replacement of the components of momentum, $p_x$, $p_y$, $p_z$, by
$p_x + e/c.A_x$, $p_y + e/c.A_y$, $p_z + e/c.A_z$, where $A_x$, $A_y$, $A_z$ are
the components of the vector potential describing the field. For a uniform
field of magnitude $\mathcal{H}$ in the direction of the $z$-axis we may take
$A_x = -\tfrac12\mathcal{H}y$, $A_y = \tfrac12\mathcal{H}x$, $A_z = 0$. The classical
Hamiltonian will then be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac{1}{2}\frac{e}{c}\mathcal{H}y\right)^2 + \left(p_y + \frac{1}{2}\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r}.$$

This classical Hamiltonian may be taken over into the quantum
theory if we add on to it a term giving the effect of the spin of the
electron. According to experimental evidence and according to the
theory of Chapter XI, the electron has a magnetic moment
$-e\hbar/2mc.\boldsymbol{\sigma}$, where $\boldsymbol{\sigma}$ is the spin vector of § 37. The energy
of this magnetic moment in the magnetic field will be
$e\hbar\mathcal{H}/2mc.\sigma_z$. Thus the total quantum Hamiltonian will be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac{1}{2}\frac{e}{c}\mathcal{H}y\right)^2 + \left(p_y + \frac{1}{2}\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r} + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z. \tag{88}$$

There ought strictly to be other terms in this Hamiltonian giving the
interaction of the magnetic moment of the electron with the electric
field of the nucleus of the atom, but this effect is small, of the same 
order of magnitude as the correction one gets by taking relativistic 
mechanics into account, and will be neglected here. It will be taken 
into account in the relativistic theory of the electron given in 
Chapter XI. 

If the magnetic field is not too large, we can neglect terms involving
$\mathcal{H}^2$, so that the Hamiltonian (88) reduces to

$$\begin{aligned}
H &= \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r} + \frac{e\mathcal{H}}{2mc}(xp_y - yp_x) + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z \\
  &= \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r} + \frac{e\mathcal{H}}{2mc}(m_z + \hbar\sigma_z).
\end{aligned} \tag{89}$$

The extra terms due to the magnetic field are now
$e\mathcal{H}/2mc.(m_z + \hbar\sigma_z)$. But these extra terms commute with the
total Hamiltonian and are thus constants of the motion. This makes the
problem very easy. The stationary states of the system, i.e. the
eigenstates of the Hamiltonian (89), will be those eigenstates of the
Hamiltonian for no field that are simultaneously eigenstates of the
observables $m_z$ and $\sigma_z$, or at least of the one observable
$m_z + \hbar\sigma_z$, and the energy-levels of the system will be those for
the system with no field, given by (80) if one considers only closed
states, increased by an eigenvalue of $e\mathcal{H}/2mc.(m_z + \hbar\sigma_z)$.
Thus stationary states of the system with no field for which $m_z$ has the
numerical value $m_z'$, an integral multiple of $\hbar$, and for which also
$\sigma_z$ has the numerical value $\sigma_z' = \pm 1$, will still be stationary
states when the field is applied. Their energy will be increased by an
amount consisting of the sum of two parts, a part $e\mathcal{H}/2mc.m_z'$
arising from the orbital motion, which part may be considered as due to
an orbital magnetic moment $-em_z'/2mc$, and a part
$e\mathcal{H}/2mc.\hbar\sigma_z'$ arising from the spin. The ratio of the orbital
magnetic moment to the orbital angular momentum $m_z'$ is $-e/2mc$, which
is half the ratio of the spin magnetic moment to the spin angular
momentum. This fact is sometimes referred to as the magnetic
anomaly of the spin.

Since the energy-levels now involve $m_z$, the selection rule for $m_z$
obtained in the preceding section becomes capable of direct comparison
with experiment. We take a Heisenberg representation in
which, among other constants of the motion, $m_z$ and $\sigma_z$ are diagonal.
The selection rule for $m_z$ now requires $m_z$ to change by $\hbar$, $0$, or
$-\hbar$, while $\sigma_z$, since it commutes with the electric displacement,
will not change at all. Thus the energy difference between the two states
taking part in the transition process will differ by an amount
$e\hbar\mathcal{H}/2mc$, $0$, or $-e\hbar\mathcal{H}/2mc$ from its value for no magnetic
field. Hence, from Bohr's frequency condition, the frequency of the
associated electromagnetic radiation will differ by $e\mathcal{H}/4\pi mc$, $0$, or
$-e\mathcal{H}/4\pi mc$ from that for no magnetic field. This means that each
spectral line for no magnetic field gets split up by the field into three
components. If one considers radiation moving in the $z$-direction,
then from (83) the two outer components will be circularly polarized,
while the central undisplaced one will be of zero intensity. These
results are in agreement with experiment and also with the classical
theory of the Zeeman effect.
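To give a feeling for the size of the splitting, the frequency displacement $e\mathcal{H}/4\pi mc$ can be evaluated numerically. The sketch below assumes Gaussian units (charge in e.s.u., field in gauss), which is the system in which the expressions $p + e/c.A$ above are written; the rounded constants and the field strength of $10^4$ gauss are illustrative choices, not values taken from the text.

```python
import math

# Rounded physical constants in Gaussian (c.g.s.) units (assumed values).
E_ESU = 4.80320e-10   # magnitude of the electron charge, e.s.u.
M_G = 9.10938e-28     # electron mass, grams
C_CM_S = 2.99792e10   # velocity of light, cm/s

def zeeman_frequency_shifts(field_gauss):
    """Frequency displacements (in cycles per second) of the three components."""
    delta = E_ESU * field_gauss / (4.0 * math.pi * M_G * C_CM_S)
    return (delta, 0.0, -delta)

shifts = zeeman_frequency_shifts(1.0e4)   # a field of 10^4 gauss (assumed)
print(shifts)
```

For a field of this strength the outer components are displaced by roughly $1.4 \times 10^{10}$ cycles per second, a small fraction of an optical frequency.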
42. General remarks

In the preceding chapter exact treatments were given of some simple 
dynamical systems in the quantum theory. Most quantum problems, 
however, cannot be solved exactly with the present resources of 
mathematics, as they lead to equations whose solutions cannot be 
expressed in finite terms with the help of the ordinary functions of 
analysis. For such problems one can often use a perturbation method. 
This consists in splitting up the Hamiltonian into two parts, one of 
which must be simple and the other small. The first part may then 
be considered as the Hamiltonian of a simplified or unperturbed 
system, which can be dealt with exactly, and the addition of the 
second will then require small corrections, of the nature of a perturba- 
tion, in the solution for the unperturbed system. The requirement 
that the first part shall be simple requires in practice that it shall not 
involve the time explicitly. If the second part contains a small
numerical factor $\epsilon$, we can obtain the solution of our equations for
the perturbed system in the form of a power series in $\epsilon$, which,
provided it converges, will give the answer to our problem with any
desired accuracy. Even when the series does not converge, the first 
approximation obtained by means of it is usually fairly accurate. 

There are two distinct methods in perturbation theory. In one of
these the perturbation is considered as causing a modification of the
states of motion of the unperturbed system. In the other we do not
consider any modification to be made in the states of the unperturbed
system, but we suppose that the perturbed system, instead of remaining
permanently in one of these states, is continually changing from
one to another, or making transitions, under the influence of the
perturbation. Which method is to be used in any particular case
depends on the nature of the problem to be solved. The first method
is useful usually only when the perturbing energy (the correction in the
Hamiltonian for the undisturbed system) does not involve the time
explicitly, and is then applied to the stationary states. It can be used
for calculating things that do not refer to any definite time, such as
the energy-levels of the stationary states of the perturbed system, or,
in the case of collision problems, the probability of scattering through
a given angle. The second method must, on the other hand, be used
for solving all problems involving a consideration of time, such as 
those about the transient phenomena that occur when the perturba- 
tion is suddenly applied, or more generally problems in which the 
perturbation varies with the time in any way (i.e. in which the per- 
turbing energy involves the time explicitly). Again, this second 
method must be used in collision problems, even though the per- 
turbing energy does not here involve the time explicitly, if one 
wishes to calculate absorption and emission probabilities, since these 
probabilities, unlike a scattering probability, cannot be defined with- 
out reference to a state of affairs that varies with the time. 

One can summarize the distinctive features of the two methods by 
saying that, with the first method, one compares the stationary states 
of the perturbed system with those of the unperturbed system; with 
the second method one takes a stationary state of the unperturbed 
system and sees how it varies with time under the influence of the 
perturbation. 

43. The change in the energy-levels caused by a perturbation

The first of the above-mentioned methods will now be applied to 
the calculation of the changes in the energy-levels of a system caused 
by a perturbation. We assume the perturbing energy, like the Hamil- 
tonian for the unperturbed system, not to involve the time explicitly. 
Our problem has a meaning, of course, only provided the energy-levels 
of the unperturbed system are discrete and the differences between 
them are large compared with the changes in them caused by the 
perturbation. This circumstance results in the treatment of perturba- 
tion problems by the first method having some different features 
according to whether the energy-levels of the unperturbed system are 
discrete or continuous. 

Let the Hamiltonian of the perturbed system be 

H = E+V, (1) 

E being the Hamiltonian of the unperturbed system and V the small 
perturbing energy. By hypothesis each eigenvalue $H'$ of $H$ lies very
close to one and only one eigenvalue $E'$ of $E$. We shall use the same
number of primes to specify any eigenvalue of $H$ and the eigenvalue
of $E$ to which it lies very close. Thus we shall have $H''$ differing from
$E''$ by a small quantity of order $V$ and differing from $E'$ by a quantity
that is not small unless $E' = E''$. We must now take care always to
use different numbers of primes to specify eigenvalues of $H$ and $E$
which we do not want to lie very close together.

To obtain the eigenvalues of $H$, we have to solve the equation

$$H|H'\rangle = H'|H'\rangle$$

or

$$(H'-E)|H'\rangle = V|H'\rangle. \tag{2}$$

Let $|0\rangle$ be an eigenket of $E$ belonging to the eigenvalue $E'$ and
suppose the $|H'\rangle$ and $H'$ that satisfy (2) to differ from $|0\rangle$ and $E'$
only by small quantities and to be expressed as

$$\begin{aligned}
|H'\rangle &= |0\rangle + |1\rangle + |2\rangle + \ldots, \\
H' &= E' + a_1 + a_2 + \ldots,
\end{aligned} \tag{3}$$

where $|1\rangle$ and $a_1$ are of the first order of smallness (i.e. the same order
as $V$), $|2\rangle$ and $a_2$ are of the second order, and so on. Substituting
these expressions in (2), we obtain

$$\{E' - E + a_1 + a_2 + \ldots\}\{|0\rangle + |1\rangle + |2\rangle + \ldots\} = V\{|0\rangle + |1\rangle + \ldots\}.$$

If we now separate the terms of zero order, of the first order, of the
second order, and so on, we get the following set of equations,

$$\begin{aligned}
(E'-E)|0\rangle &= 0, \\
(E'-E)|1\rangle + a_1|0\rangle &= V|0\rangle, \\
(E'-E)|2\rangle + a_1|1\rangle + a_2|0\rangle &= V|1\rangle, \\
&\;\;\vdots
\end{aligned} \tag{4}$$

The first of these equations tells us, what we have already assumed,
that $|0\rangle$ is an eigenket of $E$ belonging to the eigenvalue $E'$. The others
enable us to calculate the various corrections $|1\rangle$, $|2\rangle$,..., $a_1$, $a_2$,... .

For the further discussion of these equations it is convenient to
introduce a representation in which $E$ is diagonal, i.e. a Heisenberg
representation for the unperturbed system, and to take $E$ itself as
one of the observables whose eigenvalues label the representatives.
Let the others, in the event of others being necessary, as is the case
when there is more than one eigenstate of $E$ belonging to any
eigenvalue, be called $\beta$'s. A basic bra is then $\langle E''\beta''|$. Since $|0\rangle$
is an eigenket of $E$ belonging to the eigenvalue $E'$, we have

$$\langle E''\beta''|0\rangle = \delta_{E''E'} f(\beta''), \tag{5}$$

where $f(\beta'')$ is some function of the variables $\beta''$. With the help of this
result the second of equations (4), written in terms of representatives,
becomes

$$(E'-E'')\langle E''\beta''|1\rangle + a_1\delta_{E''E'} f(\beta'') = \sum_{\beta'}\langle E''\beta''|V|E'\beta'\rangle f(\beta'). \tag{6}$$

Putting $E'' = E'$ here, we get

$$a_1 f(\beta'') = \sum_{\beta'}\langle E'\beta''|V|E'\beta'\rangle f(\beta'). \tag{7}$$

Equation (7) is of the form of the standard equation in the theory
of eigenvalues, so far as the variables $\beta'$ are concerned. It shows that
the various possible values for $a_1$ are the eigenvalues of the matrix
$\langle E'\beta''|V|E'\beta'\rangle$. This matrix is a part of the representative of the
perturbing energy in the Heisenberg representation for the unperturbed
system, namely, the part consisting of those elements that
refer to the same unperturbed energy-level $E'$ for their row and
column. Each of these values for $a_1$ gives, to the first order, an
energy-level of the perturbed system lying close to the energy-level $E'$
of the unperturbed system.† There may thus be several energy-levels of the
perturbed system lying close to the one energy-level $E'$ of the
unperturbed system, their number being anything not exceeding the
number of independent states of the unperturbed system belonging
to the energy-level $E'$. In this way the perturbation may cause a
separation or partial separation of the energy-levels that coincide
at $E'$ for the unperturbed system.
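The simplest case of such a separation has two independent states belonging to $E'$, so that the matrix in (7) has two rows and columns. A minimal sketch, assuming real matrix elements $v_{11}$, $v_{12} = v_{21}$, $v_{22}$ with illustrative values, solves (7) in closed form for the two first-order shifts $a_1$ and for a corresponding unnormalized $f(\beta)$. The function names are hypothetical.

```python
import math

def first_order_shifts(v11, v12, v22):
    """The two eigenvalues a_1 of the real symmetric block of V."""
    mean = 0.5 * (v11 + v22)
    half_gap = math.hypot(0.5 * (v11 - v22), v12)
    return (mean - half_gap, mean + half_gap)

def zero_order_state(v11, v12, v22, a1):
    """An unnormalized solution f of (7) for the eigenvalue a1 (needs v12 != 0)."""
    return (v12, a1 - v11)

def satisfies_equation_7(v11, v12, v22, a1, f):
    """Check a1 f(beta'') = sum over beta' of <beta''|V|beta'> f(beta')."""
    row1 = v11 * f[0] + v12 * f[1]
    row2 = v12 * f[0] + v22 * f[1]
    return (math.isclose(row1, a1 * f[0], abs_tol=1e-12)
            and math.isclose(row2, a1 * f[1], abs_tol=1e-12))

v11, v12, v22 = 0.02, 0.05, -0.01      # illustrative matrix elements (assumed)
for a1 in first_order_shifts(v11, v12, v22):
    f = zero_order_state(v11, v12, v22, a1)
    print(a1, f, satisfies_equation_7(v11, v12, v22, a1, f))
```

Unless both $v_{12}$ and $v_{11} - v_{22}$ vanish, the two values of $a_1$ differ and the level $E'$ is split in two.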

Equation (7) also determines, to the zero order, the representatives
$\langle E''\beta''|0\rangle$ of the stationary states of the perturbed system belonging
to energy-levels lying close to $E'$, any solution $f(\beta')$ of (7) substituted
in (5) giving one such representative. Each of these stationary states
of the perturbed system approximates to one of the stationary states
of the unperturbed system, but the converse, that each stationary
state of the unperturbed system approximates to one of the stationary
states of the perturbed system, is not true, since the general
stationary state of the unperturbed system belonging to the energy-level
$E'$ is represented by the right-hand side of (5) with an arbitrary
function $f(\beta'')$. The problem of finding which stationary states of
the unperturbed system approximate to stationary states of the
perturbed system, i.e. the problem of finding the solutions $f(\beta')$ of
(7), corresponds to the problem of 'secular perturbations' in classical
mechanics. It should be noted that the above results are independent
of the values of all those matrix elements of the perturbing
energy which refer to two different energy-levels of the unperturbed
system.

† To distinguish these energy-levels one from another we should require some
more elaborate notation, since according to the present notation they must all be
specified by the same number of primes, namely by the number of primes specifying
the energy-level of the unperturbed system from which they arise. For our present
purposes, however, this more elaborate notation is not required.

Let us see what the above results become in the specially simple case
when there is only one stationary state of the unperturbed system
belonging to each energy-level.† In this case $E$ alone fixes the
representation, no $\beta$'s being required. The sum in (7) now reduces to a
single term and we get

$$a_1 = \langle E'|V|E'\rangle. \tag{8}$$

There is only one energy-level of the perturbed system lying close to
any energy-level of the unperturbed system and the change in energy
is equal, in the first order, to the corresponding diagonal element of the
perturbing energy in the Heisenberg representation for the unperturbed
system, or to the average value of the perturbing energy for the
corresponding unperturbed state. The latter formulation of the result is
the same as in classical mechanics when the unperturbed system is
multiply periodic.

† A system with only one stationary state belonging to each energy-level is often
called non-degenerate and one with two or more stationary states belonging to an
energy-level is called degenerate, although these words are not very appropriate from
the modern point of view.

We shall proceed to calculate the second-order correction $a_2$ in
the energy-level for the case when the unperturbed system is
non-degenerate. Equation (5) for this case reads

$$\langle E''|0\rangle = \delta_{E''E'},$$

with neglect of an unimportant numerical factor, and equation (6)
reads

$$(E'-E'')\langle E''|1\rangle + a_1\delta_{E''E'} = \langle E''|V|E'\rangle.$$

This gives us the value of $\langle E''|1\rangle$ when $E'' \neq E'$, namely

$$\langle E''|1\rangle = \frac{\langle E''|V|E'\rangle}{E'-E''}. \tag{9}$$

The third of equations (4), written in terms of representatives,
becomes

$$(E'-E'')\langle E''|2\rangle + a_1\langle E''|1\rangle + a_2\delta_{E''E'} = \sum_{E'''}\langle E''|V|E'''\rangle\langle E'''|1\rangle.$$

Putting $E'' = E'$ here, we get

$$a_1\langle E'|1\rangle + a_2 = \sum_{E''}\langle E'|V|E''\rangle\langle E''|1\rangle,$$

which reduces, with the help of (8), to

$$a_2 = \sum_{E''\neq E'}\langle E'|V|E''\rangle\langle E''|1\rangle.$$

Substituting for $\langle E''|1\rangle$ from (9), we obtain finally

$$a_2 = \sum_{E''\neq E'}\frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E'-E''},$$

giving for the total energy change to the second order

$$a_1 + a_2 = \langle E'|V|E'\rangle + \sum_{E''\neq E'}\frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E'-E''}. \tag{10}$$


The method may be developed for the calculation of the higher
approximations if required. General recurrence formulas giving the
nth order corrections in terms of those of lower order have been
obtained by Born, Heisenberg, and Jordan.†
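Formula (10) can be tried out on a system small enough to be solved exactly. The sketch below assumes a non-degenerate system with just two unperturbed levels and a real symmetric perturbing matrix with illustrative numerical values; it compares the first-order and second-order estimates for the lower level with the exact eigenvalue obtained from the quadratic secular equation. The function names are hypothetical.

```python
import math

def energy_to_second_order(E, V, i):
    """E' + a_1 + a_2 for level i, from (8) and (10), for real symmetric V."""
    a1 = V[i][i]
    a2 = sum(V[i][j] * V[j][i] / (E[i] - E[j])
             for j in range(len(E)) if j != i)
    return E[i] + a1 + a2

def exact_lower_level(E, V):
    """Lower eigenvalue of the 2 x 2 matrix E + V from the secular equation."""
    h11, h22, h12 = E[0] + V[0][0], E[1] + V[1][1], V[0][1]
    return 0.5 * (h11 + h22) - math.hypot(0.5 * (h11 - h22), h12)

E = [0.0, 1.0]                          # unperturbed levels (assumed)
V = [[0.01, 0.05], [0.05, -0.02]]       # small perturbing energy (assumed)

first = E[0] + V[0][0]
second = energy_to_second_order(E, V, 0)
exact = exact_lower_level(E, V)
print(first, second, exact)
```

With these values the second-order estimate lies much closer to the exact level than the first-order one, the remaining error being of the third order in $V$.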


44. The perturbation considered as causing transitions

We shall now consider the second of the two perturbation methods 
mentioned in § 42. We suppose again that we have an unperturbed 
system governed by a Hamiltonian E which does not involve the 
time explicitly, and a perturbing energy V which can now be an 
arbitrary function of the time. The Hamiltonian for the perturbed 
system is again H = E+V. For the present method it does not
make any essential difference whether the energy-levels of the 
unperturbed system, i.e. the eigenvalues of E, form a discrete or 
continuous set. We shall, however, take the discrete case, for 
definiteness. We shall again work with a Heisenberg representation 
for the unperturbed system, but as there will now be no advantage in 
taking E itself as one of the observables whose eigenvalues label the 
representatives, we shall suppose we have a general set of $\alpha$'s to label
the representatives.

Let us suppose that at the initial time $t_0$ the system is in a state for
which the $\alpha$'s certainly have the values $\alpha'$. The ket corresponding to
this state is the basic ket $|\alpha'\rangle$. If there were no perturbation, i.e. if the
Hamiltonian were $E$, this state would be stationary. The perturbation
causes the state to change. At time $t$ the ket corresponding to the
state in Schrödinger's picture will be $T|\alpha'\rangle$, according to equation (1)
of § 27. The probability of the $\alpha$'s then having the values $\alpha''$ is

$$P(\alpha'\alpha'') = |\langle\alpha''|T|\alpha'\rangle|^2. \tag{11}$$

For $\alpha'' \neq \alpha'$, $P(\alpha'\alpha'')$ is the probability of a transition taking place
from state $\alpha'$ to state $\alpha''$ during the time interval $t_0 \to t$, while
$P(\alpha'\alpha')$ is the probability of no transition taking place at all. The
sum of $P(\alpha'\alpha'')$ for all $\alpha''$ is, of course, unity.

† Z. f. Physik, 35 (1925), 565.
Let us now suppose that initially the system, instead of being
certainly in the state $\alpha'$, is in one or other of various states $\alpha'$ with
the probability $P_{\alpha'}$ for each. The Gibbs density corresponding to this
distribution is, according to (68) of § 33

$$\rho = \sum_{\alpha'} |\alpha'\rangle P_{\alpha'}\langle\alpha'|. \tag{12}$$

At time $t$, each ket $|\alpha'\rangle$ will have changed to $T|\alpha'\rangle$ and each bra
$\langle\alpha'|$ to $\langle\alpha'|\bar T$, so $\rho$ will have changed to

$$\rho_t = \sum_{\alpha'} T|\alpha'\rangle P_{\alpha'}\langle\alpha'|\bar T. \tag{13}$$

The probability of the $\alpha$'s then having the values $\alpha''$ will be, from
(73) of § 33,

$$\langle\alpha''|\rho_t|\alpha''\rangle = \sum_{\alpha'}\langle\alpha''|T|\alpha'\rangle P_{\alpha'}\langle\alpha'|\bar T|\alpha''\rangle
 = \sum_{\alpha'} P_{\alpha'}P(\alpha'\alpha'') \tag{14}$$

with the help of (11). This result expresses that the probability of
the system being in the state $\alpha''$ at time $t$ is the sum of the probabilities
of the system being initially in any state $\alpha' \neq \alpha''$, and making a
transition from state $\alpha'$ to state $\alpha''$, and the probability of its being
initially in the state $\alpha''$ and making no transition. Thus the various
transition probabilities act independently of one another, according to
the ordinary laws of probability.

The whole problem of calculating transitions thus reduces to the
determination of the probability amplitudes $\langle\alpha''|T|\alpha'\rangle$. These can be
worked out from the differential equation for $T$, equation (6) of § 27, or

$$i\hbar\, dT/dt = HT = (E+V)T. \tag{15}$$

The calculation can be simplified by working with

$$T^* = e^{iE(t-t_0)/\hbar}\,T. \tag{16}$$

We have

$$i\hbar\, dT^*/dt = e^{iE(t-t_0)/\hbar}(-ET + i\hbar\, dT/dt)
 = e^{iE(t-t_0)/\hbar}\,VT = V^*T^*, \tag{17}$$

where

$$V^* = e^{iE(t-t_0)/\hbar}\,V\,e^{-iE(t-t_0)/\hbar}, \tag{18}$$

i.e. $V^*$ is the result of applying a certain unitary transformation to $V$.
Equation (17) is of a more convenient form than (15), because (17)
makes the change in $T^*$ depend entirely on the perturbation $V$, and
for $V = 0$ it would make $T^*$ equal its initial value, namely unity.
We have from (16)

$$\langle\alpha''|T^*|\alpha'\rangle = e^{iE''(t-t_0)/\hbar}\langle\alpha''|T|\alpha'\rangle,$$

so that

$$P(\alpha'\alpha'') = |\langle\alpha''|T^*|\alpha'\rangle|^2, \tag{19}$$

showing that $T^*$ and $T$ are equally good for determining transition
probabilities.

Our work up to the present has been exact. We now assume $V$ is
a small quantity of the first order and express $T^*$ in the form

$$T^* = 1 + T_1^* + T_2^* + \ldots, \tag{20}$$

where $T_1^*$ is of the first order, $T_2^*$ is of the second, and so on.
Substituting (20) into (17) and equating terms of equal order, we get

$$\begin{aligned}
i\hbar\, dT_1^*/dt &= V^*, \\
i\hbar\, dT_2^*/dt &= V^*T_1^*, \\
&\;\;\vdots
\end{aligned} \tag{21}$$

From the first of these equations we obtain

$$T_1^* = -i\hbar^{-1}\int_{t_0}^{t} V^*(t')\,dt', \tag{22}$$

from the second we obtain

$$T_2^* = -\hbar^{-2}\int_{t_0}^{t} V^*(t')\,dt'\int_{t_0}^{t'} V^*(t'')\,dt'', \tag{23}$$

and so on. For many practical problems it is sufficiently accurate to
retain only the term $T_1^*$, which gives for the transition probability
$P(\alpha'\alpha'')$ with $\alpha'' \neq \alpha'$

$$P(\alpha'\alpha'') = \hbar^{-2}\left|\langle\alpha''|\int_{t_0}^{t} V^*(t')\,dt'\,|\alpha'\rangle\right|^2
 = \hbar^{-2}\left|\int_{t_0}^{t}\langle\alpha''|V^*(t')|\alpha'\rangle\,dt'\right|^2. \tag{24}$$

We obtain in this way the transition probability to the second order
of accuracy. The result depends only on the matrix element
$\langle\alpha''|V^*(t')|\alpha'\rangle$ of $V^*(t')$ referring to the two states concerned, with
$t'$ going from $t_0$ to $t$. Since $V^*$ is real, like $V$,

$$\langle\alpha''|V^*(t')|\alpha'\rangle = \overline{\langle\alpha'|V^*(t')|\alpha''\rangle}$$

and hence

$$P(\alpha'\alpha'') = P(\alpha''\alpha') \tag{25}$$

to the second order of accuracy.
Sometimes one is interested in a transition α' → α'' such that the matrix element <α''|V*|α'> vanishes, or is small compared with other matrix elements of V*. It is then necessary to work to a higher accuracy. If we retain only the terms $T_1^*$ and $T_2^*$, we get, for α'' ≠ α',

$$P(\alpha'\alpha'') = \hbar^{-2}\left|\int_{t_0}^{t}\langle\alpha''|V^*(t')|\alpha'\rangle\,dt' - i\hbar^{-1}\sum_{\alpha'''\neq\alpha',\alpha''}\int_{t_0}^{t}\langle\alpha''|V^*(t')|\alpha'''\rangle\,dt'\int_{t_0}^{t'}\langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt''\right|^2. \tag{26}$$

The terms α''' = α' and α''' = α'' are omitted from the sum since they are small compared with other terms of the sum, on account of the smallness of <α''|V*|α'>. To interpret the result (26), we may suppose that the term

$$\int_{t_0}^{t}\langle\alpha''|V^*(t')|\alpha'\rangle\,dt' \tag{27}$$

gives rise to a transition directly from state α' to state α'', while the term

$$-i\hbar^{-1}\int_{t_0}^{t}\langle\alpha''|V^*(t')|\alpha'''\rangle\,dt'\int_{t_0}^{t'}\langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt'' \tag{28}$$

gives rise to a transition from state α' to state α''', followed by a transition from state α''' to state α''. The state α''' is called an intermediate state in this interpretation. We must add the term (27) to the various terms (28) corresponding to different intermediate states and then take the square of the modulus of the sum, which means that there is interference between the different transition processes (the direct one and those involving intermediate states) and one cannot give a meaning to the probability for one of these processes by itself. For each of these processes, however, there is a probability amplitude. If one carries out the perturbation method to a higher degree of accuracy, one obtains a result which can be interpreted similarly, with the help of more complicated transition processes involving a succession of intermediate states.

45. Application to radiation 

In the preceding section a general theory of the perturbation of an atomic system was developed, in which the perturbing energy could vary with the time in an arbitrary way. A perturbation of this kind can be realized in practice by allowing incident electromagnetic radiation to fall on the system. Let us see what our result (24) reduces to in this case.

If we neglect the effects of the magnetic field of the incident radiation, and if we further assume that the wave-lengths of the harmonic components of this radiation are all large compared with the dimensions of the atomic system, then the perturbing energy is simply the scalar product

$$V = (\mathbf{D}, \boldsymbol{\mathcal{E}}), \tag{29}$$

where $\mathbf{D}$ is the total electric displacement of the system and $\boldsymbol{\mathcal{E}}$ is the electric force of the incident radiation. We suppose $\boldsymbol{\mathcal{E}}$ to be a given function of the time. If we take for simplicity the case when the incident radiation is plane polarized with its electric vector in a certain direction and let D denote the Cartesian component of $\mathbf{D}$ in this direction, the expression (29) for V reduces to the ordinary product

$$V = D\mathcal{E},$$

where ℰ is the magnitude of the vector $\boldsymbol{\mathcal{E}}$. The matrix elements of V are

$$\langle\alpha''|V|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,\mathcal{E},$$

since ℰ is a number. The matrix element <α''|D|α'> is independent of t. From (18)

$$\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\,e^{i(E''-E')(t-t_0)/\hbar}\,\mathcal{E}(t),$$

and hence the expression (24) for the transition probability becomes

$$P(\alpha'\alpha'') = \hbar^{-2}|\langle\alpha''|D|\alpha'\rangle|^2\left|\int_{t_0}^{t} e^{i(E''-E')(t'-t_0)/\hbar}\,\mathcal{E}(t')\,dt'\right|^2. \tag{30}$$

If the incident radiation during the time interval $t_0$ to t is resolved into its Fourier components, the energy crossing unit area per unit frequency range about the frequency ν will be, according to classical electrodynamics,

$$E_\nu = \frac{c}{2\pi}\left|\int_{t_0}^{t} e^{2\pi i\nu(t'-t_0)}\,\mathcal{E}(t')\,dt'\right|^2. \tag{31}$$

Comparing this with (30), we obtain

$$P(\alpha'\alpha'') = 2\pi c^{-1}\hbar^{-2}|\langle\alpha''|D|\alpha'\rangle|^2 E_\nu, \tag{32}$$

where

$$\nu = |E''-E'|/h. \tag{33}$$

From this result we see in the first place that the transition probability depends only on that Fourier component of the incident radiation whose frequency ν is connected with the change of energy by (33). This gives us Bohr's Frequency Condition and shows how the ideas of Bohr's atomic theory, which was the forerunner of quantum mechanics, can be fitted in with quantum mechanics.
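
The dependence of (30) on a single Fourier component can be illustrated numerically. The sketch below is an illustration under stated assumptions: the incident field is taken to be a pure cosine of angular frequency omega (an example chosen here, not one from the text), omega0 stands for (E''-E')/ħ, $t_0$ is taken as zero, and the time integral is evaluated by the midpoint rule. The function name is hypothetical.

```python
import cmath
import math

def time_integral_sq(omega0, omega, t_end, steps=20000):
    """|integral_0^t_end exp(i*omega0*t') * cos(omega*t') dt'|^2 by the midpoint rule.

    omega0 plays the part of (E'' - E')/hbar in (30); the field cos(omega*t')
    is an assumed monochromatic example.
    """
    h = t_end / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += cmath.exp(1j * omega0 * t) * math.cos(omega * t)
    return abs(total * h) ** 2

omega0 = 1.0
for omega in (0.5, 1.0, 1.5):
    print(omega, time_integral_sq(omega0, omega, t_end=200.0))
```

For a long interval the value at omega = omega0 grows like the square of the interval, while the values off resonance stay bounded, so the resonant component dominates the transition probability.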

The present elementary theory does not tell us anything about the energy of the field of radiation. It would be reasonable to assume, though, that the energy absorbed or liberated by the atomic system in the transition process comes from or goes into the component of the radiation with frequency ν given by (33). This assumption will be justified by the more complete theory of radiation given in Chapter X. The result (32) is then to be interpreted as the probability of the system, if initially in the state of lower energy, absorbing radiation and being carried to the upper state, and if initially in the upper state, being stimulated by the incident radiation to emit and fall to the lower state. The present theory does not account for the experimental fact that the system, if in the upper state with no incident radiation, can emit spontaneously and fall to the lower state, but this also will be accounted for by the more complete theory of Chapter X.

The existence of the phenomenon of stimulated emission was inferred by Einstein,† long before the discovery of quantum mechanics, from a consideration of statistical equilibrium between atoms and a field of black-body radiation satisfying Planck's law. Einstein showed that the transition probability for stimulated emission must equal that for absorption between the same pair of states, in agreement with the present quantum theory, and deduced also a relation connecting this transition probability with that for spontaneous emission, which relation is in agreement with the theory of Chapter X.

The matrix element <α''|D|α'> in (32) plays the part of the amplitude of one of the Fourier components of D in the classical theory of a multiply-periodic system interacting with radiation. In fact it was the idea of replacing classical Fourier components by matrix elements which led Heisenberg to the discovery of quantum mechanics in 1925. Heisenberg assumed that the formulas describing the interaction with radiation of a system in the quantum theory can be obtained from the classical formulas by substituting for the Fourier components of the total electric displacement of the system the corresponding matrix elements. According to this assumption applied to spontaneous emission, a system having an electric moment $\mathbf{D}$ will, when in the state α', spontaneously emit radiation of frequency ν = (E'-E'')/h, where E'' is an energy-level, less than E', of some state α'', at the rate

$$\frac{4}{3}\,\frac{(2\pi\nu)^4}{c^3}\,|\langle\alpha''|\mathbf{D}|\alpha'\rangle|^2. \tag{34}$$

The distribution of this radiation over the different directions of emission and its state of polarization for each direction will be the same as that for a classical electric dipole of moment equal to the real part of <α''|$\mathbf{D}$|α'>. To interpret this rate of emission of radiant energy as a transition probability, we must divide it by the quantum of energy of this frequency, namely hν, and call it the probability per unit time of this quantum being spontaneously emitted, with the atomic system simultaneously dropping to the state α'' of lower energy. These assumptions of Heisenberg are justified by the present radiation theory, supplemented by the spontaneous transition theory of Chapter X.

† Einstein, Phys. Zeits. 18 (1917), 121.


46. Transitions caused by a perturbation independent of the 
time 

The perturbation method of § 44 is still valid when the perturbing energy V does not involve the time t explicitly. Since the total Hamiltonian H in this case does not involve t explicitly, we could now, if desired, deal with the system by the perturbation method of § 43 and find its stationary states. Whether this method would be convenient or not would depend on what we want to find out about the system. If what we have to calculate makes an explicit reference to the time, e.g. if we have to calculate the probability of the system being in a certain state at one time when we are given that it is in a certain state at another time, the method of § 44 would be the more convenient one.


Let us see what the result (24) for the transition probability becomes when V does not involve t explicitly and let us take $t_0$ = 0 to simplify the writing. The matrix element <α''|V|α'> is now independent of t, and from (18)

$$\langle\alpha''|V^*(t')|\alpha'\rangle = \langle\alpha''|V|\alpha'\rangle\,e^{i(E''-E')t'/\hbar}, \tag{35}$$

so that

$$\int_0^t\langle\alpha''|V^*(t')|\alpha'\rangle\,dt' = \langle\alpha''|V|\alpha'\rangle\,\frac{e^{i(E''-E')t/\hbar}-1}{i(E''-E')/\hbar},$$

provided E'' ≠ E'. Thus the transition probability (24) becomes

$$P(\alpha'\alpha'') = |\langle\alpha''|V|\alpha'\rangle|^2\,[e^{i(E''-E')t/\hbar}-1][e^{-i(E''-E')t/\hbar}-1]/(E''-E')^2$$

$$= 2|\langle\alpha''|V|\alpha'\rangle|^2\,[1-\cos\{(E''-E')t/\hbar\}]/(E''-E')^2. \tag{36}$$

If E'' differs appreciably from E' this transition probability is small and remains so for all values of t. This result is required by the law of the conservation of energy. The total energy H is constant and hence the proper-energy E (i.e. the energy with neglect of the part V due to the perturbation), being approximately equal to H, must be approximately constant. This means that if E initially has the numerical value E', at any later time there must be only a small probability of its having a numerical value differing considerably from E'.
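
As a rough check on (36), one may compare it with a case that can be solved exactly. The sketch below assumes a system with only two states, proper-energies e1 and e2, and a real constant matrix element v of V between them with vanishing diagonal elements; for such a system the exact transition probability is given by the well-known Rabi formula, which is not derived in the text. Units with ħ = 1 and all function names are assumptions of the sketch.

```python
import math

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def p_first_order(v, e1, e2, t):
    """Equation (36): 2|v|^2 [1 - cos((E''-E')t/hbar)] / (E''-E')^2."""
    d = e2 - e1
    return 2.0 * v * v * (1.0 - math.cos(d * t / HBAR)) / (d * d)

def p_exact_two_level(v, e1, e2, t):
    """Exact result (Rabi formula) for H = [[e1, v], [v, e2]], starting in state 1."""
    half = 0.5 * (e2 - e1)
    omega = math.sqrt(v * v + half * half)
    return (v * v / (omega * omega)) * math.sin(omega * t / HBAR) ** 2

# Compare the two expressions for increasingly strong perturbations.
for v in (0.01, 0.1, 0.5):
    print(v, p_first_order(v, 0.0, 1.0, 2.0), p_exact_two_level(v, 0.0, 1.0, 2.0))
```

For v small compared with the energy difference the two expressions agree closely, and both remain small for all t, while for larger v the first-order formula ceases to be reliable.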

On the other hand, when the initial state α' is such that there exists another state α'' having the same or very nearly the same proper-energy E, the probability of a transition to the final state α'' may be quite large. The case of physical interest now is that in which there is a continuous range of final states α'' having a continuous range of proper-energy levels E'' passing through the value E' of the proper-energy of the initial state. The initial state must not be one of the continuous range of final states, but may be either a separate discrete state or one of another continuous range of states. We shall now have, remembering the rules of § 18 for the interpretation of probability amplitudes with continuous ranges of states, that, with P(α'α'') having the value (36), the probability of a transition to a final state within the small range α'' to α''+dα'' will be P(α'α'')dα'' if the initial state α' is discrete and will be proportional to this quantity if α' is one of a continuous range.

We may suppose that the α's describing the final state consist of E together with a number of other dynamical variables β, so that we have a representation like that of § 43 for the degenerate case. (The β's, however, need have no meaning for the initial state α'.) We shall suppose for definiteness that the β's have only discrete eigenvalues. The total probability of a transition to a final state α'' for which the β's have the values β'' and E has any value (there will be a strong probability of its having a value near the initial value E') will now be (or be proportional to)

$$\int P(\alpha'\alpha'')\,dE'' = 2\int_{-\infty}^{\infty}|\langle E''\beta''|V|\alpha'\rangle|^2\,[1-\cos\{(E''-E')t/\hbar\}]/(E''-E')^2\,dE'' \tag{37}$$

$$= 2t\hbar^{-1}\int_{-\infty}^{\infty}|\langle E'+\hbar x/t,\ \beta''|V|\alpha'\rangle|^2\,[1-\cos x]/x^2\,dx$$

if one makes the substitution (E''-E')t/ħ = x. For large values of t this reduces to

$$2t\hbar^{-1}|\langle E'\beta''|V|\alpha'\rangle|^2\int_{-\infty}^{\infty}[1-\cos x]/x^2\,dx = 2\pi t\hbar^{-1}|\langle E'\beta''|V|\alpha'\rangle|^2. \tag{38}$$

Thus the total probability up to time t of a transition to a final state for which the β's have the values β'' is proportional to t. There is therefore a definite probability coefficient, or probability per unit time, for the transition process under consideration, having the value

$$2\pi\hbar^{-1}|\langle E'\beta''|V|\alpha'\rangle|^2. \tag{39}$$

It is proportional to the square of the modulus of the matrix element, associated with this transition, of the perturbing energy.
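
The step leading to (38) uses the fact that the integral of [1 - cos x]/x² over all x equals π. A minimal numerical check is sketched below, assuming a midpoint rule on a finite range and an approximate correction for the part of the range that is cut off; the function names, the cut-off, and the number of steps are choices made here for illustration.

```python
import math

def one_minus_cos_over_x2(x):
    """[1 - cos x]/x^2, written as 2 sin^2(x/2)/x^2 to avoid cancellation near x = 0."""
    return 2.0 * math.sin(0.5 * x) ** 2 / (x * x)

def integral_estimate(cutoff=200.0, steps=200000):
    """Midpoint rule on (0, cutoff), doubled by symmetry, plus the tail 2/cutoff."""
    h = cutoff / steps
    half = sum(one_minus_cos_over_x2((k + 0.5) * h) for k in range(steps)) * h
    # For x > cutoff the oscillating part nearly averages out and the integrand
    # behaves like 1/x^2, so each tail contributes approximately 1/cutoff.
    return 2.0 * (half + 1.0 / cutoff)

print(integral_estimate(), math.pi)
```

The two printed numbers should agree to within the accuracy of the tail correction, which is of the order of the inverse square of the cut-off.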

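If the matrix element <E'β''|V|α'> is small compared with other matrix elements of V, we must work with the more accurate formula (26). We have from (35)

$$\int_0^t\langle\alpha''|V^*(t')|\alpha'''\rangle\,dt'\int_0^{t'}\langle\alpha'''|V^*(t'')|\alpha'\rangle\,dt''$$

$$= \langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle\int_0^t e^{i(E''-E''')t'/\hbar}\,dt'\int_0^{t'} e^{i(E'''-E')t''/\hbar}\,dt''$$

$$= \frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{i(E'''-E')/\hbar}\int_0^t\{e^{i(E''-E')t'/\hbar}-e^{i(E''-E''')t'/\hbar}\}\,dt'.$$

For E'' close to E', only the first term in the integrand here gives rise to a transition probability of physical importance and the second term may be discarded. Using this result in (26) we get

$$P(\alpha'\alpha'') = 2\left|\langle\alpha''|V|\alpha'\rangle - \sum_{\alpha'''\neq\alpha',\alpha''}\frac{\langle\alpha''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\right|^2\frac{1-\cos\{(E''-E')t/\hbar\}}{(E''-E')^2},$$

which replaces (36). Proceeding as before, we obtain for the transition probability per unit time to a final state for which the β's have the values β'' and E has a value close to its initial value E'

$$\frac{2\pi}{\hbar}\left|\langle E'\beta''|V|\alpha'\rangle - \sum_{\alpha'''\neq\alpha',\alpha''}\frac{\langle E'\beta''|V|\alpha'''\rangle\langle\alpha'''|V|\alpha'\rangle}{E'''-E'}\right|^2. \tag{40}$$

This formula shows how intermediate states, differing from the initial state and final state, play a role in the determination of a probability coefficient.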
In order that the approximations used in deriving (39) and (40) may be valid, the time t must be not too small and not too large. It must be large compared with the periods of the atomic system in order that the approximate evaluation of the integral (37) leading to the result (38) may be valid, while it must not be excessively large or else the general formula (24) or (26) will break down. In fact one could make the probability (38) greater than unity by taking t large enough. The upper limit to t is fixed by the condition that the probability (24) or (26), or t times (39) or (40), must be small compared with unity. There is no difficulty in t satisfying both these conditions simultaneously provided the perturbing energy V is sufficiently small.

47. The anomalous Zeeman effect 

One of the simplest examples of the perturbation method of § 43 is the calculation of the first-order change in the energy-levels of an atom caused by a uniform magnetic field. The problem of a hydrogen atom in a uniform magnetic field has already been dealt with in § 41 and was so simple that perturbation theory was unnecessary. The case of a general atom is not much more complicated when we make a few approximations such that we can set up a simple model for the atom.

We first of all consider the atom in the absence of the magnetic field and look for constants of the motion or quantities that are approximately constants of the motion. The total angular momentum of the atom, the vector $\mathbf{j}$ say, is certainly a constant of the motion. This angular momentum may be regarded as the sum of two parts, the total orbital angular momentum of all the electrons, $\mathbf{l}$ say, and the total spin angular momentum, $\mathbf{s}$ say. Thus we have $\mathbf{j} = \mathbf{l}+\mathbf{s}$. Now the effect of the spin magnetic moments on the motion of the electrons is small compared with the effect of the Coulomb forces and may be neglected as a first approximation. With this approximation the spin angular momentum of each electron is a constant of the motion, there being no forces tending to change its orientation. Thus $\mathbf{s}$, and hence also $\mathbf{l}$, will be constants of the motion. The magnitudes, l, s, and j say, of $\mathbf{l}$, $\mathbf{s}$, and $\mathbf{j}$ will be given by

$$l+\tfrac{1}{2}\hbar = (l_x^2+l_y^2+l_z^2+\tfrac{1}{4}\hbar^2)^{1/2},$$

$$s+\tfrac{1}{2}\hbar = (s_x^2+s_y^2+s_z^2+\tfrac{1}{4}\hbar^2)^{1/2},$$

$$j+\tfrac{1}{2}\hbar = (j_x^2+j_y^2+j_z^2+\tfrac{1}{4}\hbar^2)^{1/2},$$

corresponding to equation (39) of § 36. They commute with each other, and from (47) of § 36 we see that with given numerical values for l and s the possible numerical values for j are

$$l+s,\quad l+s-\hbar,\quad \dots,\quad |l-s|.$$
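
The list of possible j values can be generated mechanically. The sketch below assumes units in which ħ = 1 and uses exact fractions so that half-odd-integral values are handled without rounding; the function name is hypothetical.

```python
from fractions import Fraction

def allowed_j(l, s):
    """Values l+s, l+s-1, ..., |l-s| in units of hbar."""
    l, s = Fraction(l), Fraction(s)
    top, bottom = l + s, abs(l - s)
    values = []
    j = top
    while j >= bottom:
        values.append(j)
        j -= 1
    return values

print(allowed_j(1, Fraction(1, 2)))  # one electron with l = 1
print(allowed_j(2, 1))
```

Each value of j so obtained labels one level of the multiplet belonging to the given l and s.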

Let us consider a stationary state for which l, s, and j have definite numerical values in agreement with the above scheme. The energy of this state will depend on l, but one might think that with neglect of the spin magnetic moments it would be independent of s, and also of the direction of the vector $\mathbf{s}$ relative to $\mathbf{l}$, and thus of j. It will be found in Chapter IX, however, that the energy depends very much on the magnitude s of the vector $\mathbf{s}$, although independent of its direction when one neglects the spin magnetic moments, on account of certain phenomena arising from the fact that the electrons are indistinguishable one from another. There are thus different energy-levels of the system for each different value of l and s. This means that l and s are functions of the energy, according to the general definition of a function given in § 11, since the l and s of a stationary state are fixed when the energy of that state is fixed.

We can now take into account the effect of the spin magnetic moments, treating it as a small perturbation according to the method of § 43. The energy of the unperturbed system will still be approximately a constant of the motion and hence l and s, being functions of this energy, will still be approximately constants of the motion. The directions of the vectors $\mathbf{l}$ and $\mathbf{s}$, however, not being functions of the unperturbed energy, need not now be approximately constants of the motion and may undergo large secular variations. Since the vector $\mathbf{j}$ is constant, the only possible variation of $\mathbf{l}$ and $\mathbf{s}$ is a precession about the vector $\mathbf{j}$. We thus have an approximate model of the atom consisting of the two vectors $\mathbf{l}$ and $\mathbf{s}$ of constant lengths precessing about their sum $\mathbf{j}$, which is a fixed vector. The energy is determined mainly by the magnitudes of $\mathbf{l}$ and $\mathbf{s}$ and depends only slightly on their relative directions, specified by j. Thus states with the same l and s and different j will have only slightly different energy-levels, forming what is called a multiplet term.

Let us now take this atomic model as our unperturbed system and suppose it to be subjected to a uniform magnetic field of magnitude ℋ in the direction of the z-axis. The extra energy due to this magnetic field will consist of a term

$$\frac{e\mathcal{H}}{2mc}\,(m_z+\hbar\sigma_z), \tag{41}$$

like the last term in equation (89) of § 41, contributed by each electron, and will thus be altogether

$$\frac{e\mathcal{H}}{2mc}\sum(m_z+\hbar\sigma_z) = \frac{e\mathcal{H}}{2mc}\,(l_z+2s_z) = \frac{e\mathcal{H}}{2mc}\,(j_z+s_z). \tag{42}$$
This is our perturbing energy V. We shall now use the method of § 43 to determine the changes in the energy-levels caused by this V. The method will be legitimate only provided the field is so weak that V is small compared with the energy differences within a multiplet.

Our unperturbed system is degenerate, on account of the direction of the vector $\mathbf{j}$ being undetermined. We must therefore take, from the representative of V in a Heisenberg representation for the unperturbed system, those matrix elements that refer to one particular energy-level for their row and column, and obtain the eigenvalues of the matrix thus formed. We can do this best by first splitting up V into two parts, one of which is a constant of the unperturbed motion, so that its representative contains only matrix elements referring to the same unperturbed energy-level for their row and column, while the representative of the other contains only matrix elements referring to two different unperturbed energy-levels for their row and column, so that this second part does not affect the first-order perturbation. The term involving $j_z$ in (42) is a constant of the unperturbed motion and thus belongs entirely to the first part. For the term involving $s_z$ we have


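$$s_z(j_x^2+j_y^2+j_z^2) = j_z(s_xj_x+s_yj_y+s_zj_z) + (s_zj_x-j_zs_x)j_x + (s_zj_y-j_zs_y)j_y$$

or, since $j_x^2+j_y^2+j_z^2 = j(j+\hbar)$ and $s_xj_x+s_yj_y+s_zj_z = \tfrac{1}{2}\{j(j+\hbar)-l(l+\hbar)+s(s+\hbar)\}$,

$$s_z = j_z\,\frac{j(j+\hbar)-l(l+\hbar)+s(s+\hbar)}{2j(j+\hbar)} - (\gamma_yj_x-\gamma_xj_y)\,\frac{1}{j(j+\hbar)}, \tag{43}$$

where

$$\gamma_x = s_zj_y-j_zs_y = s_zl_y-l_zs_y = l_ys_z-l_zs_y,$$

$$\gamma_y = j_zs_x-s_zj_x = l_zs_x-s_zl_x = l_zs_x-l_xs_z. \tag{44}$$

The first term in this expression for $s_z$ is a constant of the unperturbed motion and thus belongs entirely to the first part, while the second term, as we shall now see, belongs entirely to the second part.

Corresponding to (44) we can introduce

$$\gamma_z = l_xs_y-l_ys_x.$$

It can now easily be verified that

$$j_x\gamma_x+j_y\gamma_y+j_z\gamma_z = 0$$

and from (30) of § 35

$$[j_z,\gamma_x] = \gamma_y, \qquad [j_z,\gamma_y] = -\gamma_x, \qquad [j_z,\gamma_z] = 0.$$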
These relations connecting $j_x$, $j_y$, $j_z$ and $\gamma_x$, $\gamma_y$, $\gamma_z$ are of the same form as the relations connecting $m_x$, $m_y$, $m_z$ and x, y, z in the calculation in § 40 of the selection rule for the matrix elements of z in a representation with k diagonal. From the result there obtained that all matrix elements of z vanish except those referring to two k values differing by ±ħ, we can infer that all matrix elements of $\gamma_z$, and similarly of $\gamma_x$ and $\gamma_y$, in a representation with j diagonal, vanish except those referring to two j values differing by ±ħ. The coefficients of $\gamma_x$ and $\gamma_y$ in the second term on the right-hand side of (43) commute with j, so the representative of the whole of this term will contain only matrix elements referring to two j values differing by ±ħ, and thus referring to two different energy-levels of the unperturbed system.

Hence the perturbing energy V becomes, when we neglect that part of it whose representative consists of matrix elements referring to two different unperturbed energy-levels,

$$\frac{e\mathcal{H}}{2mc}\,j_z\left\{1+\frac{j(j+\hbar)-l(l+\hbar)+s(s+\hbar)}{2j(j+\hbar)}\right\}. \tag{45}$$

The eigenvalues of this give the first-order changes in the energy-levels. We can make the representative of this expression diagonal by choosing our representation such that $j_z$ is diagonal, and it then gives us directly the first-order changes in the energy-levels caused by the magnetic field. This expression is known as Landé's formula.
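
Landé's formula is easily evaluated for particular values of l, s, and j. The following sketch assumes units in which ħ = 1, so that the eigenvalues of $j_z$ run from -j to +j in unit steps, and expresses the level shifts in units of eℋħ/2mc; the function names are hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def lande_factor(l, s, j):
    """Factor multiplying (e H / 2mc) j_z in (45), with hbar = 1 and j > 0."""
    l, s, j = Fraction(l), Fraction(s), Fraction(j)
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))

def level_shifts(l, s, j):
    """First-order shifts in units of e*H*hbar/2mc for j_z = -j, ..., +j."""
    g = lande_factor(l, s, j)
    j = Fraction(j)
    shifts = []
    m = -j
    while m <= j:
        shifts.append(g * m)
        m += 1
    return shifts

half = Fraction(1, 2)
# The two levels of the doublet with l = 1, s = 1/2.
print(lande_factor(1, half, 3 * half), lande_factor(1, half, half))
print(level_shifts(1, half, half))
```

When s = 0 the factor reduces to unity, and when l = 0 it reduces to 2, corresponding to the terms $j_z$ and $2s_z$ of (42) taken separately.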

The result (45) holds only provided the perturbing energy V is small compared with the energy differences within a multiplet. For larger values of V a more complicated theory is required. For very strong fields, however, for which V is large compared with the energy differences within a multiplet, the theory is again very simple. We may now neglect altogether the energy of the spin magnetic moments for the atom with no external field, so that for our unperturbed system the vectors $\mathbf{l}$ and $\mathbf{s}$ themselves are constants of the motion, and not merely their magnitudes l and s. Our perturbing energy V, which is still $e\mathcal{H}/2mc\,.\,(j_z+s_z)$, is now a constant of the motion for the unperturbed system, so that its eigenvalues give directly the changes in the energy-levels. These eigenvalues are integral or half-odd integral multiples of $e\mathcal{H}\hbar/2mc$ according to whether the number of electrons in the atom is even or odd.
48. General remarks

In this chapter we shall investigate problems connected with a particle which, coming from infinity, encounters or 'collides with' some atomic system and, after being scattered through a certain angle, goes off to infinity again. The atomic system which does the scattering we shall call, for brevity, the scatterer. We thus have a dynamical system composed of an incident particle and a scatterer interacting with each other, which we must deal with according to the laws of quantum mechanics, and for which we must, in particular, calculate the probability of scattering through any given angle. The scatterer is usually assumed to be of infinite mass and to be at rest throughout the scattering process. The problem was first solved by Born by a method substantially equivalent to that of the next section. We must take into account the possibility that the scatterer, considered as a system by itself, may have a number of different stationary states and that if it is initially in one of these states when the particle arrives from infinity, it may be left in a different one when the particle goes off to infinity again. The colliding particle may thus induce transitions in the scatterer.

The Hamiltonian for the whole system of scatterer plus particle will not involve the time explicitly, so that this whole system will have stationary states represented by periodic solutions of Schrödinger's wave equation. The meaning of these stationary states requires a little care to be properly understood. It is evident that for any state of motion of the system the particle will spend nearly all its time at infinity, so that the time average of the probability of the particle being in any finite volume will be zero. Now for a stationary state the probability of the particle being in a given finite volume, like any other result of observation, must be independent of the time, and hence this probability will equal its time average, which we have seen is zero. Thus only the relative probabilities of the particle being in different finite volumes will be physically significant, their absolute values being all zero. The total energy of the system has a continuous range of eigenvalues, since the initial energy of the particle can be anything. Thus a ket, |s> say, corresponding to a stationary state, being an eigenket of the total energy, must be of infinite length. We can see a physical reason for this, since if |s> were normalized and if Q denotes that observable (a certain function of the position of the particle) that is equal to unity if the particle is in a given finite volume and zero otherwise, then <s|Q|s> would be zero, meaning that the average value of Q, i.e. the probability of the particle being in the given volume, is zero. Such a ket |s> would not be a convenient one to work with. However, with |s> of infinite length, <s|Q|s> can be finite and would then give the relative probability of the particle being in the given volume.

In picturing a state of a system corresponding to a ket |x> which is not normalized, but for which <x|x> = n say, it may be convenient to suppose that we have n similar systems all occupying the same space but with no interaction between them, so that each one follows out its own motion independently of the others, as we had in the theory of the Gibbs ensemble in § 33. We can then interpret <x|α|x>, where α is any observable, directly as the total α for all the n systems. In applying these ideas to the above-mentioned |s> of infinite length, corresponding to a stationary state of the system of scatterer plus colliding particle, we should picture an infinite number of such systems with the scatterers all located at the same point and the particles distributed continuously throughout space. The number of particles in a given finite volume would be pictured as <s|Q|s>, Q being the observable defined above, which has the value unity when the particle is in the given volume and zero otherwise. If the ket is represented by a Schrödinger wave function involving the Cartesian coordinates of the particle, then the square of the modulus of the wave function could be interpreted directly as the density of particles in the picture. One must remember, however, that each of these particles has its own individual scatterer. Different particles may belong to scatterers in different states. There will thus be one particle density for each state of the scatterer, namely the density of those particles belonging to scatterers in that state. This is taken account of by the wave function involving variables describing the state of the scatterer in addition to those describing the position of the particle.

For determining scattering coefficients we have to investigate stationary states of the whole system of scatterer plus particle. For instance, if we want to determine the probability of scattering in various directions when the scatterer is initially in a given stationary state and the incident particle has initially a given velocity in a given direction, we must investigate that stationary state of the whole system whose picture, according to the above method, contains at great distances from the point of location of the scatterers only particles moving with the given initial velocity and direction and belonging each to a scatterer in the given initial stationary state, together with particles moving outward from the point of location of the scatterers and belonging possibly to scatterers in various stationary states. This picture corresponds closely to the actual state of affairs in an experimental determination of scattering coefficients, with the difference that the picture really describes only one actual system of scatterer plus particle. The distribution of outward moving particles at infinity in the picture gives us immediately all the information about scattering coefficients that could be obtained by experiment. For practical calculations about the stationary state described by this picture one may use a perturbation method somewhat like that of § 43, taking as unperturbed system, for example, that for which there is no interaction between the scatterer and particle.

In dealing with collision problems, a further possibility to be taken into consideration is that the scatterer may perhaps be capable of absorbing and re-emitting the particle. This possibility arises when there exists one or more states of absorption of the whole system, a state of absorption being an approximately stationary state which is closed in the sense mentioned at the end of § 38 (i.e. for which the probability of the particle being at a greater distance than r from the scatterer tends to zero as r → ∞). Since a state of absorption is only approximately stationary, its property of being closed will be only a transient one, and after a sufficient lapse of time there will be a finite probability of the particle being on its way to infinity. Physically this means there is a finite probability of spontaneous emission of the particle. The fact that we had to use the word 'approximately' in stating the conditions required for the phenomena of emission and absorption to be able to occur shows that these conditions are not expressible in exact mathematical language. One can give a meaning to these phenomena only with reference to a perturbation method. They occur when the unperturbed system (of scatterer plus particle) has stationary states that are closed. The introduction of the perturbation spoils the stationary property of these states and gives rise to spontaneous emission and its converse absorption.
For calculating absorption and emission probabilities it is necessary to deal with non-stationary states of the system, in contradistinction to the case for scattering coefficients, so that the perturbation method of § 44 must be used. Thus for calculating an emission coefficient we must consider the non-stationary states of absorption described above. Again, since an absorption is always followed by a re-emission, it cannot be distinguished from a scattering in any experiment involving a steady state of affairs, corresponding to a stationary state of the system. The distinction can be made only by reference to a non-steady state of affairs, e.g. by use of a stream of incident particles that has a sharp beginning, so that the scattered particles will appear immediately after the incident particles meet the scatterers, while those that have been absorbed and re-emitted will begin to appear only some time later. This stream of particles would be the picture of a certain ket of infinite length, which could be used for calculating the absorption coefficient.

49. The scattering coefficient 

We shall now consider the calculation of scattering coefficients, 
taking first the case when there is no absorption and emission, which 
means that our unperturbed system has no closed stationary states. 
We may conveniently take this unperturbed system to be that for 
which there is no interaction between the scatterer and particle. Its 
Hamiltonian will thus be of the form 

$$E = H_s + W, \tag{1}$$

where $H_s$ is that for the scatterer alone and $W$ that for the particle
alone, namely, with neglect of relativistic mechanics,

$$W = \frac{1}{2m}\,(p_x^2 + p_y^2 + p_z^2). \tag{2}$$

The perturbing energy $V$, assumed small, will now be a function of
the Cartesian coordinates of the particle $x$, $y$, $z$, and also, perhaps,
of its momenta $p_x$, $p_y$, $p_z$, together with dynamical variables
describing the scatterer.

Since we are now interested only in stationary states of the whole
system, we use a perturbation method like that of § 43. Our
unperturbed system now necessarily has a continuous range of
energy-levels, since it contains a free particle, and this gives rise to
certain modifications in the perturbation method. The question of the
change in the energy-levels caused by the perturbation, which was the
main question of § 43, no longer has a meaning, and the convention in
§ 43 of using the same number of primes to denote nearly equal
eigenvalues of $E$ and $H$ now drops out. Again, the splitting of
energy-levels which we had in § 43 when the unperturbed system is
degenerate cannot now arise, since if the unperturbed system is
degenerate the perturbed one, which must also have a continuous range
of energy-levels, will also be degenerate to exactly the same extent.

We again use the general scheme of equations developed at the
beginning of § 43, equations (1) to (4) there, but we now take our
unperturbed stationary state forming the zero-order approximation
to belong to an energy-level $E'$ just equal to the energy-level $H'$ of
our perturbed stationary state. Thus the $a$'s introduced in the second
of equations (3) § 43 are now all zero and the second of equations
(4) there now reads

$$(E' - E)\,|1\rangle = V|0\rangle. \tag{3}$$

Similarly, the third of equations (4) § 43 now reads

$$(E' - E)\,|2\rangle = V|1\rangle. \tag{4}$$


We shall proceed to solve equation (3) and to obtain the scattering 
coefficient to the first order. We shall need equation (4) in § 51. 

Let $\alpha$ denote a complete set of commuting observables describing
the scatterer, which are constants of the motion when the scatterer is
alone and may thus be used for labelling the stationary states of the
scatterer. This requires that $H_s$ shall commute with the $\alpha$'s and be
a function of them. We can now take a representation of the whole
system in which the $\alpha$'s and $x$, $y$, $z$, the coordinates of the particle,
are diagonal. This will make $H_s$ diagonal. Let $|0\rangle$ be represented by
$\langle\mathbf{x}\alpha'|0\rangle$ and $|1\rangle$ by $\langle\mathbf{x}\alpha'|1\rangle$, the single variable $\mathbf{x}$ being written to
denote $x$, $y$, $z$ and the prime being omitted from $\mathbf{x}$ for brevity. In the
same way the single differential $d^3x$ will be written to denote the
product $dx\,dy\,dz$. Equation (3), written in terms of representatives,
becomes, with the help of (1) and (2),

$$\Big\{E' - H_s(\alpha') + \frac{\hbar^2}{2m}\nabla^2\Big\}\langle\mathbf{x}\alpha'|1\rangle = \sum_{\alpha''}\int \langle\mathbf{x}\alpha'|V|\mathbf{x}''\alpha''\rangle\, d^3x''\, \langle\mathbf{x}''\alpha''|0\rangle. \tag{5}$$

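Suppose that the incident particle has the momentum $\mathbf{p}^0$ and that
the initial stationary state of the scatterer is $\alpha^0$. The stationary state
of our unperturbed system is now the one for which $\mathbf{p} = \mathbf{p}^0$ and
$\alpha = \alpha^0$, and hence its representative is

$$\langle\mathbf{x}\alpha'|0\rangle = \delta_{\alpha'\alpha^0}\, e^{i(\mathbf{p}^0,\mathbf{x})/\hbar}. \tag{6}$$

This makes equation (5) reduce to

$$\Big\{E' - H_s(\alpha') + \frac{\hbar^2}{2m}\nabla^2\Big\}\langle\mathbf{x}\alpha'|1\rangle = \int \langle\mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\, d^3x^0\, e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}$$

or

$$(k^2 + \nabla^2)\langle\mathbf{x}\alpha'|1\rangle = F, \tag{7}$$

where

$$k^2 = 2m\hbar^{-2}\{E' - H_s(\alpha')\} \tag{8}$$

and

$$F = 2m\hbar^{-2}\int \langle\mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\, d^3x^0\, e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}, \tag{9}$$

a definite function of $x$, $y$, $z$, and $\alpha'$. We must also have

$$E' = H_s(\alpha^0) + \mathbf{p}^{0\,2}/2m. \tag{10}$$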


Our problem now is to obtain a solution $\langle\mathbf{x}\alpha'|1\rangle$ of (7) which, for
values of $x$, $y$, $z$ denoting points far from the scatterer, represents
only outward moving particles. The square of its modulus, $|\langle\mathbf{x}\alpha'|1\rangle|^2$,
will then give the density of scattered particles belonging to scatterers
in the state $\alpha'$ when the density of the incident particles is $|\langle\mathbf{x}\alpha^0|0\rangle|^2$,
which is unity. If we transform to polar coordinates $r$, $\theta$, $\phi$, equation
(7) becomes

$$\Big\{k^2 + \frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\Big\}\langle r\theta\phi\alpha'|1\rangle = F. \tag{11}$$

Now $F$ must tend to zero as $r \to \infty$, on account of the physical
requirement that the interaction energy between the scatterer and
particle must tend to zero as the distance between them tends to
infinity. If we neglect $F$ in (11) altogether, an approximate solution
for large $r$ is

$$\langle r\theta\phi\alpha'|1\rangle = u(\theta\phi\alpha')\, r^{-1} e^{ikr}, \tag{12}$$

where $u$ is an arbitrary function of $\theta$, $\phi$, and $\alpha'$, since this expression
substituted in the left-hand side of (11) gives a result of order $r^{-3}$.
When we do not neglect $F$, the solution of (11) will still be of the
form (12) for large $r$, provided $F$ tends to zero sufficiently rapidly as
$r \to \infty$, but the function $u$ will now be definite and determined by the
solution for smaller values of $r$.

For values $\alpha'$ of the $\alpha$'s such that $k^2$, defined by (8), is positive, the
$k$ in (12) must be chosen to be the positive square root of $k^2$, in order
that (12) may represent only outward moving particles, i.e. particles
for which the radial component of momentum, which from § 38
equals $p_r - i\hbar r^{-1}$ or $-i\hbar(\partial/\partial r + r^{-1})$, has a positive value. We now
have that the density of scattered particles belonging to scatterers in
state $\alpha'$, equal to the square of the modulus of (12), falls off with
increasing $r$ according to the inverse square law, as is physically
necessary, and their angular distribution is given by $|u(\theta\phi\alpha')|^2$.
Further, the magnitude, $P'$ say, of the momentum of these scattered
particles must equal $k\hbar$, the momentum being radial for large $r$,
so that their energy is equal to

$$\frac{P'^2}{2m} = \frac{k^2\hbar^2}{2m} = E' - H_s(\alpha') = \frac{\mathbf{p}^{0\,2}}{2m} - H_s(\alpha') + H_s(\alpha^0),$$

with the help of (8) and (10). This is just the energy of an incident
particle, namely $\mathbf{p}^{0\,2}/2m$, reduced by the increase in energy of the
scatterer, namely $H_s(\alpha') - H_s(\alpha^0)$, in agreement with the law of
conservation of energy. For values $\alpha'$ of the $\alpha$'s such that $k^2$ is negative
there are no scattered particles, the total initial energy being
insufficient for the scatterer to be left in the state $\alpha'$.
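
The energy bookkeeping of (8) and (10) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: units with $\hbar = 1$, and a hypothetical three-level scatterer whose energies are invented for the example. The names `outgoing_momentum` and `LEVELS` are ours and do not come from the text.

```python
import math

def outgoing_momentum(p0, m, e_initial, e_final, hbar=1.0):
    """Return (k_squared, P_out) for one final scatterer state.

    k_squared follows equation (8) with E' taken from equation (10);
    P_out = k*hbar is None when k_squared is not positive, i.e. when the
    final state is forbidden by conservation of energy.
    """
    total_energy = e_initial + p0 ** 2 / (2.0 * m)            # equation (10)
    k_squared = 2.0 * m * (total_energy - e_final) / hbar ** 2  # equation (8)
    if k_squared <= 0.0:
        return k_squared, None
    return k_squared, hbar * math.sqrt(k_squared)

# Hypothetical scatterer energies H_s(alpha'), chosen only for illustration.
LEVELS = {"ground": 0.0, "excited": 0.3, "too_high": 2.0}

# Incident momentum 1 and mass 1, scatterer initially in the ground state.
channels = {
    name: outgoing_momentum(1.0, 1.0, LEVELS["ground"], energy)[1]
    for name, energy in LEVELS.items()
}

for name, p_out in channels.items():
    print(name, "closed" if p_out is None else f"P' = {p_out:.4f}")
```

The closed channel is the case of negative $k^2$ discussed above, for which there are no scattered particles.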

We must now evaluate $u(\theta\phi\alpha')$ for a set of values $\alpha'$ for the $\alpha$'s such
that $k^2$ is positive, and obtain the angular distribution of the scattered
particles belonging to scatterers in state $\alpha'$. It is sufficient to evaluate
$u$ for the direction $\theta = 0$ of the pole of the polar coordinates, since
this direction is arbitrary. We make use of Green's theorem, which
states that for any two functions of position $A$ and $B$ the volume
integral $\int (A\nabla^2 B - B\nabla^2 A)\, d^3x$ taken over any volume equals the
surface integral $\int (A\,\partial B/\partial n - B\,\partial A/\partial n)\, dS$ taken over the boundary
of the volume, $\partial/\partial n$ denoting differentiation along the normal to
the surface. We take

$$A = e^{-ikr\cos\theta}, \qquad B = \langle r\theta\phi\alpha'|1\rangle$$

and apply the theorem to a large sphere with the origin as centre.
The volume integrand is thus

$$e^{-ikr\cos\theta}\,\nabla^2\langle r\theta\phi\alpha'|1\rangle - \langle r\theta\phi\alpha'|1\rangle\,\nabla^2 e^{-ikr\cos\theta} = e^{-ikr\cos\theta}(\nabla^2 + k^2)\langle r\theta\phi\alpha'|1\rangle = e^{-ikr\cos\theta} F$$

from (7) or (11), while the surface integrand is, with the help of (12),

$$e^{-ikr\cos\theta}\,\frac{\partial}{\partial r}\langle r\theta\phi\alpha'|1\rangle - \langle r\theta\phi\alpha'|1\rangle\,\frac{\partial}{\partial r}e^{-ikr\cos\theta} = iku\,r^{-1}(1 + \cos\theta)\,e^{ikr(1-\cos\theta)}$$

with neglect of $r^{-2}$. Hence we get

$$\int e^{-ikr\cos\theta} F\, d^3x = \int_0^{2\pi} d\phi \int_0^{\pi} r^2\sin\theta\, d\theta \;.\; iku\,r^{-1}(1 + \cos\theta)\,e^{ikr(1-\cos\theta)},$$

the volume integral on the left being taken over the whole of space.
The right-hand side becomes, on being integrated by parts with
respect to $\theta$,

$$\int_0^{2\pi} d\phi \Big\{ \Big[e^{ikr(1-\cos\theta)}\,u(1 + \cos\theta)\Big]_{\theta=0}^{\theta=\pi} - \int_0^{\pi} e^{ikr(1-\cos\theta)}\,\frac{\partial}{\partial\theta}\big[u(1 + \cos\theta)\big]\, d\theta \Big\}.$$

The second term in the { } brackets is of the order of magnitude of
$r^{-1}$, as would be revealed by further partial integrations, and may
therefore be neglected. We are thus left with

$$\int e^{-ikr\cos\theta} F\, d^3x = -2\int_0^{2\pi} d\phi\; u(0\phi\alpha') = -4\pi u(0\phi\alpha'),$$

giving the value of $u(\theta\phi\alpha')$ for the direction $\theta = 0$.

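This result may be written

$$u(0\phi\alpha') = -(4\pi)^{-1}\int e^{-iP'r\cos\theta/\hbar}\, F\, d^3x, \tag{13}$$

since $P' = k\hbar$. If the vector $\mathbf{p}'$ denotes the momentum of the scattered
electrons coming off in a certain direction (and is thus of magnitude
$P'$), the value of $u$ for this direction will be

$$u(\theta'\phi'\alpha') = -(4\pi)^{-1}\int e^{-i(\mathbf{p}',\mathbf{x})/\hbar}\, F\, d^3x,$$

as follows from (13) if one takes this direction to be the pole of the
polar coordinates. This becomes, with the help of (9),

$$u(\theta'\phi'\alpha') = -(2\pi)^{-1} m\hbar^{-2}\iint e^{-i(\mathbf{p}',\mathbf{x})/\hbar}\, d^3x\, \langle\mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\, d^3x^0\, e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}$$

$$\phantom{u(\theta'\phi'\alpha')} = -2\pi m h\,\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle, \tag{14}$$

when one makes a transformation from the coordinates $\mathbf{x}$ to the
momenta $\mathbf{p}$ of the particle, using the transformation function (54)
of § 23. The single letter $\mathbf{p}$ is here used as a label for the three
components of momentum.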
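
The numerical coefficient in (14) may be checked as follows. This is a worked step of ours, not part of the original derivation. It assumes that the transformation function (54) of § 23 is $\langle\mathbf{x}|\mathbf{p}\rangle = h^{-3/2}e^{i(\mathbf{p},\mathbf{x})/\hbar}$ with $h = 2\pi\hbar$, and that $F$ carries the factor $2m\hbar^{-2}$ which arises on multiplying (5) through by $2m/\hbar^2$ to reach (7).

```latex
\begin{aligned}
\iint e^{-i(\mathbf{p}',\mathbf{x})/\hbar}\,
      \langle\mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\,
      e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}\, d^3x\, d^3x^0
  &= h^{3}\,\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle, \\
-(4\pi)^{-1}\cdot 2m\hbar^{-2}\cdot h^{3}
  &= -\frac{2m}{4\pi}\cdot\frac{4\pi^{2}}{h^{2}}\cdot h^{3}
   = -2\pi m h .
\end{aligned}
```

Each exponential supplies one factor $h^{3/2}$ when it is replaced by a transformation function, which accounts for the $h^3$ in the first line.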

The density of scattered particles belonging to scatterers in state
$\alpha'$ is now given by $|u(\theta'\phi'\alpha')|^2/r^2$. Since their velocity is $P'/m$, the
rate at which these particles appear per unit solid angle about the
direction of the vector $\mathbf{p}'$ will be $P'/m \,.\, |u(\theta'\phi'\alpha')|^2$. The density of
the incident particles is, as we have seen, unity, so that the number
of incident particles crossing unit area per unit time is equal to their
velocity $P^0/m$, where $P^0$ is the magnitude of $\mathbf{p}^0$. Hence the effective
area that must be hit by an incident particle in order to be scattered
in a unit solid angle about the direction $\mathbf{p}'$ and then belong to a
scatterer in state $\alpha'$ will be

$$P'/P^0 \,.\, |u(\theta'\phi'\alpha')|^2 = 4\pi^2 m^2 h^2\, P'/P^0 \,.\, |\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle|^2. \tag{15}$$

This is the scattering coefficient for transitions $\alpha^0 \to \alpha'$ of the scatterer.
It depends on that matrix element $\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle$ of the perturbing
energy $V$ whose column $\mathbf{p}^0\alpha^0$ and whose row $\mathbf{p}'\alpha'$ refer respectively to
the initial and final states of the unperturbed system, between which
the scattering transition process takes place. The result (15) is thus
in some ways analogous to the result (24) of § 44, although the
numerical coefficients are different in the two cases, corresponding
to the different natures of the two transition processes.
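
As a concrete instance of (15), the following Python sketch evaluates the scattering coefficient for a fixed centre of force with no internal states. The screened Coulomb (Yukawa) potential $V(r) = g\,e^{-\mu r}/r$ is our own choice of example and is not discussed in the text; its matrix element is taken with the assumed transformation function $\langle\mathbf{x}|\mathbf{p}\rangle = h^{-3/2}e^{i(\mathbf{p},\mathbf{x})/\hbar}$. All function names are hypothetical. The result is compared with the familiar first-order (Born) expression $(2mg/\hbar^2)^2/(\mu^2 + q^2/\hbar^2)^2$.

```python
import math

def matrix_element_yukawa(q, g, mu, hbar=1.0):
    """<p'|V|p0> for V(r) = g*exp(-mu*r)/r, where q = |p' - p0|.

    Uses h^-3 times the Fourier transform of V, which for this
    potential is 4*pi*g / (mu^2 + (q/hbar)^2).
    """
    h = 2.0 * math.pi * hbar
    return 4.0 * math.pi * g / (h ** 3 * (mu ** 2 + (q / hbar) ** 2))

def scattering_coefficient(p0, theta, m, g, mu, hbar=1.0):
    """Formula (15) for elastic scattering (P' = P0) through angle theta."""
    h = 2.0 * math.pi * hbar
    q = 2.0 * p0 * math.sin(theta / 2.0)   # momentum transfer
    v = matrix_element_yukawa(q, g, mu, hbar)
    return 4.0 * math.pi ** 2 * m ** 2 * h ** 2 * abs(v) ** 2

def born_closed_form(p0, theta, m, g, mu, hbar=1.0):
    """Familiar closed form of the same first-order result, for comparison."""
    q = 2.0 * p0 * math.sin(theta / 2.0)
    return (2.0 * m * g / hbar ** 2) ** 2 / (mu ** 2 + (q / hbar) ** 2) ** 2

for angle in (0.1, 1.0, 3.0):
    print(angle,
          scattering_coefficient(1.5, angle, 1.0, 0.2, 0.7),
          born_closed_form(1.5, angle, 1.0, 0.2, 0.7))
```

The two columns agree because $4\pi^2 m^2 h^2 \cdot (4\pi g)^2/h^6 = (2mg/\hbar^2)^2$.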


50. Solution with the momentum representation 
The result (15) for the scattering coefficient makes a reference only 
to that representation in which the momentum p is diagonal. One 
would thus expect to be able to get a more direct proof of the result 
by working all the time in the p-representation, instead of working 
in the x-representation and transforming at the end to the
p-representation, as was done in § 49. This would not at first sight
appear to be a great improvement, as the lack of directness of the
x-representation method is offset by more direct applicability, it being
possible to picture the square of the modulus of the x-representative 
of a state as the density of a stream of particles in process of being 
scattered. The x-representation method has, however, other more 
serious disadvantages. One of the main applications of the theory 
of collisions is to the case of photons as incident particles. Now a 
photon is not a simple particle but has a polarization. It is evident 
from classical electromagnetic theory that a photon with a definite 
momentum, i.e. one moving in a definite direction with a definite 
frequency, may have a definite state of polarization (linear, circular, 
etc.), while a photon with a definite position, which is to be pictured 
as an electromagnetic disturbance confined to a very small volume, 
cannot have any definite polarization. These facts mean that the 
polarization observable of a photon commutes with its momentum 
but not with its position. This results in the p-representation method
being immediately applicable to the case of photons, it being only
necessary to introduce the polarizing variable into the representatives
and treat it along with the $\alpha$'s describing the scatterer, while the
x-representation method is not applicable. Further, in dealing with
photons, it is necessary to take relativistic mechanics into account.
This can easily be done in the p-representation method, but not so
easily in the x-representation method.

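Equation (3) still holds with relativistic mechanics, but $W$ is now
given by

$$W^2/c^2 = m^2c^2 + P^2 = m^2c^2 + p_x^2 + p_y^2 + p_z^2 \tag{16}$$

instead of by (2). Written in terms of p-representatives, equation (3)
gives

$$\{E' - H_s(\alpha') - W\}\langle\mathbf{p}\alpha'|1\rangle = \langle\mathbf{p}\alpha'|V|0\rangle,$$

$\mathbf{p}$ being written instead of $\mathbf{p}'$ for brevity and $W$ being understood as
a definite function of $p_x$, $p_y$, $p_z$ given by (16). This may be written

$$(W' - W)\langle\mathbf{p}\alpha'|1\rangle = \langle\mathbf{p}\alpha'|V|0\rangle, \tag{17}$$

where

$$W' = E' - H_s(\alpha') \tag{18}$$

and is the energy required by the law of conservation of energy for
a scattered particle belonging to a scatterer in state $\alpha'$. The ket $|0\rangle$
is represented by (6) in the x-representation and the basic ket $|\mathbf{p}^0\alpha^0\rangle$
is represented by

$$\langle\mathbf{x}\alpha'|\mathbf{p}^0\alpha^0\rangle = \delta_{\alpha'\alpha^0}\langle\mathbf{x}|\mathbf{p}^0\rangle = \delta_{\alpha'\alpha^0}\, h^{-3/2} e^{i(\mathbf{p}^0,\mathbf{x})/\hbar},$$

from the transformation function (54) of § 23. Hence

$$|0\rangle = h^{3/2}\,|\mathbf{p}^0\alpha^0\rangle, \tag{19}$$

and equation (17) may be written

$$(W' - W)\langle\mathbf{p}\alpha'|1\rangle = h^{3/2}\langle\mathbf{p}\alpha'|V|\mathbf{p}^0\alpha^0\rangle. \tag{20}$$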

We now make a transformation from the Cartesian coordinates
$p_x$, $p_y$, $p_z$ of $\mathbf{p}$ to its polar coordinates $P$, $\omega$, $\chi$, given by

$$p_x = P\cos\omega, \qquad p_y = P\sin\omega\cos\chi, \qquad p_z = P\sin\omega\sin\chi.$$

If in the new representation we take the weight function $P^2\sin\omega$,
then the weight attached to any volume of p-space will be the same
as in the previous p-representation, so that the transformation will
mean simply a relabelling of the rows and columns of the matrices
without any alteration of the matrix elements. Thus (20) will become
in the new representation

$$(W' - W)\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle, \tag{21}$$

$W$ being now a function of the single variable $P$.

The coefficient of $\langle P\omega\chi\alpha'|1\rangle$, namely $W' - W$, is now simply a
multiplying factor and not a differential operator as it was with the
x-representation method. We can therefore divide out by this factor
and obtain an explicit expression for $\langle P\omega\chi\alpha'|1\rangle$. When, however, $\alpha'$
is such that $W'$, defined by (18), is greater than $mc^2$, this factor will
have the value zero for a certain point in the domain of the variable
$P$, namely the point $P = P'$, given in terms of $W'$ by (16). The
function $\langle P\omega\chi\alpha'|1\rangle$ will then have a singularity at this point. This
singularity shows that $\langle P\omega\chi\alpha'|1\rangle$ represents an infinite number of
particles moving about at great distances from the scatterers with
energies indefinitely close to $W'$, and it is therefore this singularity
that we have to study to get the angular distribution of the particles
at infinity.

The result of dividing out (21) by the factor $W' - W$ is, according
to (13) of § 15,

$$\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle/(W' - W) + \lambda(\omega\chi\alpha')\,\delta(W' - W), \tag{22}$$

where $\lambda$ is an arbitrary function of $\omega$, $\chi$, and $\alpha'$. To give a meaning
to the first term on the right-hand side of (22), we make the
convention that its integral with respect to $P$ over a range that includes
the value $P'$ is the limit when $\epsilon \to 0$ of the integral when the small
domain $P' - \epsilon$ to $P' + \epsilon$ is excluded from the range of integration.
This is sufficient to make the meaning of (22) precise, since we are
interested effectively only in the integrals of the representatives of
states when the representation has continuous ranges of rows and
columns. We see that equation (21) is inadequate to determine the
representative $\langle P\omega\chi\alpha'|1\rangle$ completely, on account of the arbitrary
function $\lambda$ occurring in (22). We must choose this $\lambda$ such that
$\langle P\omega\chi\alpha'|1\rangle$ represents only outward moving particles, since we want
the only inward moving particles to be those corresponding to $|0\rangle$.

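Let us take first the general case when the representative $\langle P\omega\chi|\rangle$
of a state of the particle satisfies an equation of the type

$$(W' - W)\langle P\omega\chi|\rangle = f(P\omega\chi), \tag{23}$$

where $f(P\omega\chi)$ is any function of $P$, $\omega$, and $\chi$, and $W'$ is a number
greater than $mc^2$, so that $\langle P\omega\chi|\rangle$ is of the form

$$\langle P\omega\chi|\rangle = f(P\omega\chi)/(W' - W) + \lambda(\omega\chi)\,\delta(W' - W), \tag{24}$$

and let us determine now what $\lambda$ must be in order that $\langle P\omega\chi|\rangle$ may
represent only outward moving particles. We can do this by
transforming $\langle P\omega\chi|\rangle$ to the x-representation, or rather the
$(r\theta\phi)$-representation, and comparing it with (12) for large values of $r$. The
transformation function is

$$\langle r\theta\phi|P\omega\chi\rangle = h^{-3/2}e^{i(\mathbf{p},\mathbf{x})/\hbar} = h^{-3/2}e^{iPr[\cos\omega\cos\theta + \sin\omega\sin\theta\cos(\chi-\phi)]/\hbar}.$$

For the direction $\theta = 0$ we find

$$\langle r0\phi|\rangle = h^{-3/2}\int_0^{\infty} P^2\, dP \int_0^{2\pi} d\chi \int_0^{\pi} \sin\omega\, d\omega\; e^{iPr\cos\omega/\hbar}\langle P\omega\chi|\rangle$$

$$= ih^{-1/2}(2\pi r)^{-1}\int_0^{\infty} P\, dP \int_0^{2\pi} d\chi \Big\{ \Big[e^{iPr\cos\omega/\hbar}\langle P\omega\chi|\rangle\Big]_{\omega=0}^{\omega=\pi} - \int_0^{\pi} e^{iPr\cos\omega/\hbar}\,\frac{\partial}{\partial\omega}\langle P\omega\chi|\rangle\, d\omega \Big\}.$$

The second term in the { } brackets is of order $r^{-2}$, as may be verified
by further partial integrations with respect to $\omega$, and can therefore
be neglected. We are left with

$$\langle r0\phi|\rangle = ih^{-1/2}(2\pi r)^{-1}\int_0^{\infty} P\, dP \int_0^{2\pi} d\chi\, \{e^{-iPr/\hbar}\langle P\pi\chi|\rangle - e^{iPr/\hbar}\langle P0\chi|\rangle\}$$

$$= ih^{-1/2}r^{-1}\int_0^{\infty} P\, dP\, \{e^{-iPr/\hbar}\langle P\pi\chi|\rangle - e^{iPr/\hbar}\langle P0\chi|\rangle\}. \tag{25}$$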

When we substitute for $\langle P\omega\chi|\rangle$ its value given by (24), the first
term in the integrand in (25) gives

$$ih^{-1/2}r^{-1}\int_0^{\infty} P\, dP\; e^{-iPr/\hbar}\{f(P\pi\chi)/(W' - W) + \lambda(\pi\chi)\,\delta(W' - W)\}. \tag{26}$$

The term involving $\delta(W' - W)$ here may be integrated immediately
and gives, when one uses the relation $P\, dP = W\, dW/c^2$, which
follows from (16),

$$ih^{-1/2}c^{-2}r^{-1}\int_{mc^2}^{\infty} W\, dW\; e^{-iPr/\hbar}\lambda(\pi\chi)\,\delta(W' - W) = ih^{-1/2}c^{-2}r^{-1}W'\lambda(\pi\chi)\,e^{-iP'r/\hbar}. \tag{27}$$

To integrate the other term in (26) we use the formula

$$\int_0^{\infty} g(P)\,\frac{e^{-iPr/\hbar}}{P' - P}\, dP = g(P')\int_0^{\infty} \frac{e^{-iPr/\hbar}}{P' - P}\, dP, \tag{28}$$

with neglect of terms involving $r^{-1}$, for any continuous function $g(P)$,
which formula holds since $\int_0^{\infty} K(P)\,e^{-iPr/\hbar}\, dP$ is of order $r^{-1}$ for any
continuous function $K(P)$ and since the difference

$$g(P)/(P' - P) - g(P')/(P' - P)$$

is continuous. The right-hand side of (28), when evaluated with
neglect of terms involving $r^{-1}$, and also with neglect of the small
domain $P' - \epsilon$ to $P' + \epsilon$ in the domain of integration, gives

$$g(P')\int_{-\infty}^{\infty} \frac{e^{-iPr/\hbar}}{P' - P}\, dP = g(P')\,e^{-iP'r/\hbar}\int_{-\infty}^{\infty} \frac{e^{i(P'-P)r/\hbar}}{P' - P}\, dP$$

$$= ig(P')\,e^{-iP'r/\hbar}\int_{-\infty}^{\infty} \frac{\sin\{(P' - P)r/\hbar\}}{P' - P}\, dP = i\pi g(P')\,e^{-iP'r/\hbar}. \tag{29}$$

In our present example $g(P)$ is

$$g(P) = ih^{-1/2}r^{-1}P f(P\pi\chi)(P' - P)/(W' - W),$$

which has the limiting value when $P = P'$,

$$g(P') = ih^{-1/2}r^{-1}P' f(P'\pi\chi)\,W'/P'c^2 = ih^{-1/2}c^{-2}r^{-1}W' f(P'\pi\chi).$$

Substituting this in (29) and adding on the expression (27), we obtain
the following value for the integral (26)

$$h^{-1/2}c^{-2}r^{-1}W'\{-\pi f(P'\pi\chi) + i\lambda(\pi\chi)\}\,e^{-iP'r/\hbar}. \tag{30}$$

Similarly the second term in the integrand in (25) gives

$$h^{-1/2}c^{-2}r^{-1}W'\{-\pi f(P'0\chi) - i\lambda(0\chi)\}\,e^{iP'r/\hbar}. \tag{31}$$

The sum of these two expressions is the value of $\langle r0\phi|\rangle$ when $r$ is
large.

We require that $\langle r0\phi|\rangle$ shall represent only outward moving
particles, and hence it must be of the form of a multiple of $e^{iP'r/\hbar}$.
Thus (30) must vanish, so that

$$\lambda(\pi\chi) = -i\pi f(P'\pi\chi). \tag{32}$$

We see in this way that the condition that $\langle r\theta\phi|\rangle$ shall represent
only outward moving particles in the direction $\theta = 0$ fixes the value
of $\lambda$ for the opposite direction $\omega = \pi$. Since the direction $\theta = 0$ or
$\omega = 0$ of the pole of our polar coordinates is not in any way singular,
we can generalize (32) to

$$\lambda(\omega\chi) = -i\pi f(P'\omega\chi), \tag{33}$$

which gives the value of $\lambda$ for an arbitrary direction. This value
substituted in (24) gives a result that may be written

$$\langle P\omega\chi|\rangle = f(P\omega\chi)\{1/(W' - W) - i\pi\,\delta(W' - W)\}, \tag{34}$$

since one can substitute $P'$ for $P$ in the coefficient of a term involving
$\delta(W' - W)$ as a factor without changing the value of the term. The
condition that $\langle P\omega\chi|\rangle$ shall represent only outward moving particles is
thus that it shall contain the factor

$$\{1/(W' - W) - i\pi\,\delta(W' - W)\}. \tag{35}$$

It is interesting to note that this factor is of the form of the right-
hand side of equation (15) of § 15.
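
The last step of (29), on which the coefficient $-i\pi$ in (35) rests, uses the fact that the integral of $\sin\{(P' - P)r/\hbar\}/(P' - P)$ over all $P$ equals $\pi$. The following Python sketch checks this numerically. It is an illustration only: it sets $\hbar = 1$, writes $u$ for $P' - P$, truncates the infinite range at an assumed cut-off, and uses a plain trapezoidal rule; the function name `dirichlet_integral` is ours.

```python
import math

def dirichlet_integral(r, half_range=20.0, n=200001):
    """Trapezoidal estimate of the integral of sin(u*r)/u over
    [-half_range, half_range]; the integrand tends to r as u -> 0."""
    du = 2.0 * half_range / (n - 1)
    total = 0.0
    for j in range(n):
        u = -half_range + j * du
        y = r if abs(u) < 1e-12 else math.sin(u * r) / u
        weight = 0.5 if j in (0, n - 1) else 1.0   # end points count half
        total += weight * y
    return total * du

value = dirichlet_integral(50.0)
print(value, math.pi)
```

The small residual difference from $\pi$ comes from the finite cut-off and falls off as the product of `r` and `half_range` grows, in line with the neglect of terms of order $r^{-1}$ in the text.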

With $\lambda$ given by (33), expression (30) vanishes and the value of
$\langle r0\phi|\rangle$ for large $r$ is given by expression (31) alone, thus

$$\langle r0\phi|\rangle = -2\pi h^{-1/2}c^{-2}r^{-1}W' f(P'0\chi)\,e^{iP'r/\hbar}.$$

This may be generalized to

$$\langle r\theta\phi|\rangle = -2\pi h^{-1/2}c^{-2}r^{-1}W' f(P'\omega\chi)\,e^{iP'r/\hbar},$$

giving the value of $\langle r\theta\phi|\rangle$ for any direction $\theta$, $\phi$ in terms of $f(P'\omega\chi)$
for the same direction labelled by $\omega$, $\chi$. This is of the form (12) with

$$u(\theta\phi) = -2\pi h^{-1/2}c^{-2}W' f(P'\omega\chi)$$

and thus represents a distribution of outward moving particles of
momentum $P'$ whose number is

$$\frac{c^2P'}{W'}\,|u|^2 = \frac{4\pi^2 W'P'}{hc^2}\,|f(P'\omega\chi)|^2 \tag{36}$$

per unit solid angle per unit time. This distribution is the one
represented by the $\langle P\omega\chi|\rangle$ of (34).

From this general result we can infer that, whenever we have a
representative $\langle P\omega\chi|\rangle$ representing only outward moving particles
and satisfying an equation of the type (23), the number per unit solid
angle per unit time of these particles is given by (36). If this $\langle P\omega\chi|\rangle$
occurs in a problem in which the number of incident particles is one
per unit volume, it will correspond to a scattering coefficient of
amount

$$\frac{4\pi^2 W^0W'P'}{hc^4P^0}\,|f(P'\omega\chi)|^2. \tag{37}$$

It is only the value of the function $f(P\omega\chi)$ for the point $P = P'$ that
is of importance.

If we now apply this general theory to our equations (21) and
(22), we have

$$f(P\omega\chi) = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle.$$

Hence from (37) the scattering coefficient is

$$4\pi^2h^2\,W^0W'P'/c^4P^0 \,.\, |\langle P'\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle|^2. \tag{38}$$

If one neglects relativity and puts $W^0W'/c^4 = m^2$, this result reduces
to the result (15) obtained in the preceding section by means of
Green's theorem.
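
The reduction of (38) to (15) can be illustrated numerically. The Python sketch below compares the kinematic factor $4\pi^2h^2W^0W'P'/c^4P^0$ of (38) with the factor $4\pi^2m^2h^2P'/P^0$ of (15), taking $W$ from (16). The units and sample momenta are arbitrary assumptions, and the function names are ours.

```python
import math

def energy(P, m, c):
    """W as a function of P from equation (16): W^2/c^2 = m^2 c^2 + P^2."""
    return c * math.sqrt(m * m * c * c + P * P)

def factor_relativistic(P0, P1, m, c, h):
    """Coefficient multiplying the squared matrix element in (38)."""
    return (4.0 * math.pi ** 2 * h ** 2
            * energy(P0, m, c) * energy(P1, m, c) * P1 / (c ** 4 * P0))

def factor_nonrelativistic(P0, P1, m, h):
    """Coefficient multiplying the squared matrix element in (15)."""
    return 4.0 * math.pi ** 2 * m ** 2 * h ** 2 * P1 / P0

m, c, h = 1.0, 1.0, 1.0
slow = (factor_relativistic(1e-3, 1e-3, m, c, h)
        / factor_nonrelativistic(1e-3, 1e-3, m, h))
fast = (factor_relativistic(1.0, 1.0, m, c, h)
        / factor_nonrelativistic(1.0, 1.0, m, h))
print(slow, fast)
```

For momenta small compared with $mc$ the ratio differs from unity only by terms of order $P^2/m^2c^2$, while for $P = mc$ the two coefficients differ by a factor 2.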

51. Dispersive scattering 

We shall now determine the scattering when the incident particle 
is capable of being absorbed, that is, when our unperturbed system 
of scatterer plus particle has closed stationary states with the particle
absorbed. The existence of these closed states for the unperturbed 
system will be found to have a considerable effect on the scattering 
for the perturbed system, and indeed an effect that depends very 
much on the energy of the incident particle, giving rise to the pheno- 
menon of dispersion in optics when the incident particle is taken to 
be a photon. 

We use a representation for which the basic kets correspond to
the stationary states of the unperturbed system, as was the case with
the p-representation of the preceding section. We take these
stationary states to be the states $(\mathbf{p}'\alpha')$ for which the particle has a definite
momentum $\mathbf{p}'$ and the scatterer is in a definite state $\alpha'$, together with
the closed states, $k$ say, which form a separate discrete set, and
assume that these states are all independent and orthogonal. This
assumption is not accurate when the particle is an electron or atomic
nucleus, since in this case for an absorbed state $k$ the particle will
still certainly be somewhere, so that one would expect to be able to
expand $|k\rangle$ in terms of the eigenkets $|\mathbf{x}'\alpha'\rangle$ of $x$, $y$, $z$, and the $\alpha$'s,
and hence also in terms of the $|\mathbf{p}'\alpha'\rangle$'s. On the other hand, when the
particle is a photon it will no longer exist for the absorbed states,
which are then certainly independent of and orthogonal to the states
$(\mathbf{p}'\alpha')$ for which the particle does exist. Thus the assumption is valid
in this case, which is an important practical one.

Since we are concerned with scattering, we must still deal with 
stationary states of the whole system. We shall now, however, have 
to work to the second order of accuracy, so that we cannot use merely
the first-order equation (3), but must use also (4). Equation (3)
becomes, when written in terms of representatives in our present
representation,

$$(W' - W)\langle\mathbf{p}\alpha'|1\rangle = \langle\mathbf{p}\alpha'|V|0\rangle, \qquad (E' - E_k)\langle k|1\rangle = \langle k|V|0\rangle, \tag{39}$$

where $W'$ is the function of $E'$ and the $\alpha'$'s given by (18) and $E_k$ is the
energy of the stationary state $k$ of the unperturbed system. Similarly,
equation (4) becomes

$$(W' - W)\langle\mathbf{p}\alpha'|2\rangle = \langle\mathbf{p}\alpha'|V|1\rangle, \qquad (E' - E_k)\langle k|2\rangle = \langle k|V|1\rangle. \tag{40}$$

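Expanding the right-hand sides by matrix multiplication, we get

$$(W' - W)\langle\mathbf{p}\alpha'|2\rangle = \sum_{\alpha''}\int \langle\mathbf{p}\alpha'|V|\mathbf{p}''\alpha''\rangle\, d^3p''\, \langle\mathbf{p}''\alpha''|1\rangle + \sum_{k''} \langle\mathbf{p}\alpha'|V|k''\rangle\langle k''|1\rangle,$$

$$(E' - E_k)\langle k|2\rangle = \sum_{\alpha''}\int \langle k|V|\mathbf{p}''\alpha''\rangle\, d^3p''\, \langle\mathbf{p}''\alpha''|1\rangle + \sum_{k''} \langle k|V|k''\rangle\langle k''|1\rangle. \tag{41}$$

The ket $|0\rangle$ is still given by (19), so (39) may be written

$$(W' - W)\langle\mathbf{p}\alpha'|1\rangle = h^{3/2}\langle\mathbf{p}\alpha'|V|\mathbf{p}^0\alpha^0\rangle, \tag{42}$$

$$(E' - E_k)\langle k|1\rangle = h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle. \tag{43}$$

We may assume that the matrix elements $\langle k'|V|k''\rangle$ of $V$ vanish,
since these matrix elements are not essential to the phenomena under
investigation, and if they did not vanish it would mean simply that
the absorbed states $k$ had not been suitably chosen. We shall further
assume that the matrix elements $\langle\mathbf{p}'\alpha'|V|\mathbf{p}''\alpha''\rangle$ are of the second order
of smallness when the matrix elements $\langle k'|V|\mathbf{p}''\alpha''\rangle$, $\langle\mathbf{p}'\alpha'|V|k''\rangle$ are
taken to be of the first order of smallness. This assumption will be
justified for the case of photons in § 64. We now have from (43) and
(42) that $\langle k|1\rangle$ is of the first order of smallness, provided $E'$ does not
lie near one of the discrete set of energy-levels $E_k$, and $\langle\mathbf{p}\alpha'|1\rangle$ is of
the second order. The value of $\langle\mathbf{p}\alpha'|2\rangle$ to the second order will thus
be given, from the first of equations (41), by

$$(W' - W)\langle\mathbf{p}\alpha'|2\rangle = h^{3/2}\sum_k \langle\mathbf{p}\alpha'|V|k\rangle\langle k|V|\mathbf{p}^0\alpha^0\rangle/(E' - E_k).$$

The total correction in the wave function to the second order, namely
$\langle\mathbf{p}\alpha'|1\rangle$ plus $\langle\mathbf{p}\alpha'|2\rangle$, therefore satisfies

$$(W' - W)\{\langle\mathbf{p}\alpha'|1\rangle + \langle\mathbf{p}\alpha'|2\rangle\} = h^{3/2}\Big\{\langle\mathbf{p}\alpha'|V|\mathbf{p}^0\alpha^0\rangle + \sum_k \langle\mathbf{p}\alpha'|V|k\rangle\langle k|V|\mathbf{p}^0\alpha^0\rangle/(E' - E_k)\Big\}.$$

This equation is of the type (23), provided $\alpha'$ is such that $W' > mc^2$,
which means that $\alpha'$ as a final state for the scatterer is not
inconsistent with the law of conservation of energy. We can therefore infer
from the general result (37) that the scattering coefficient is

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\,\Big|\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle + \sum_k \frac{\langle\mathbf{p}'\alpha'|V|k\rangle\langle k|V|\mathbf{p}^0\alpha^0\rangle}{E' - E_k}\Big|^2. \tag{44}$$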

The scattering may now be considered as composed of two parts,
a part that arises from the matrix element $\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle$ of the
perturbing energy and a part that arises from the matrix elements
$\langle\mathbf{p}'\alpha'|V|k\rangle$ and $\langle k|V|\mathbf{p}^0\alpha^0\rangle$. The first part, which is the same as our
previously obtained result (38), may be called the direct scattering.
The second part may be considered as arising from an absorption of
the incident particle into some state $k$, followed immediately by a
re-emission in a different direction, and is like the transitions through
an intermediate state considered in § 44. The fact that we have to
add the two terms before taking the square of the modulus denotes
interference between the two kinds of scattering. There is no
experimental way of separating the two kinds, the distinction between
them being only mathematical.
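
The interference expressed by (44) can be made concrete with a small Python sketch. The matrix elements below are invented numbers, chosen only so that the direct term and the term through a single absorbed state $k$ are of equal magnitude; the function name `dispersive_amplitude` is ours. The sketch evaluates only the quantity inside the modulus signs of (44), omitting the common kinematic factor.

```python
def dispersive_amplitude(direct, couplings, total_energy):
    """Quantity inside the modulus signs of (44).

    direct:    <p'a'|V|p0a0>
    couplings: list of (E_k, <p'a'|V|k>, <k|V|p0a0>) for each absorbed state k
    """
    amplitude = complex(direct)
    for e_k, v_out, v_in in couplings:
        amplitude += v_out * v_in / (total_energy - e_k)
    return amplitude

direct = 0.02                      # hypothetical direct matrix element
couplings = [(1.0, 0.1, 0.1)]      # one hypothetical absorbed state, E_k = 1

below = dispersive_amplitude(direct, couplings, 0.5)  # E' below E_k
above = dispersive_amplitude(direct, couplings, 1.5)  # E' above E_k

separate = abs(direct) ** 2 + abs(0.1 * 0.1 / 0.5) ** 2
print(abs(below) ** 2, abs(above) ** 2, separate)
```

With these numbers the two terms cancel below the level $E_k$ and reinforce each other above it, so that neither squared modulus equals the sum of the squares of the separate parts.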

52. Resonance scattering 

Suppose the energy of the incident particle to be varied
continuously while the initial state $\alpha^0$ of the scatterer is kept fixed, so
that the total energy $E'$ or $H'$ varies continuously. The formula (44)
now shows that as $E'$ approaches one of the discrete set of
energy-levels $E_k$, the scattering becomes very large. In fact, according to
formula (44) the scattering should be infinite when $E'$ is exactly equal
to an $E_k$. An infinite scattering coefficient is, of course, physically
impossible, so that we can infer that the approximations used in
deriving (44) are no longer legitimate when $E'$ is close to an $E_k$. To
investigate the scattering in this case we must therefore go back to
the exact equation

$$(E' - E)\,|H'\rangle = V|H'\rangle,$$

equation (2) of § 43 with $E'$ written for $H'$, and use a different method
of approximating to its solution. This exact equation, written in
terms of representatives like (41), becomes

$$(W' - W)\langle\mathbf{p}\alpha'|H'\rangle = \sum_{\alpha''}\int \langle\mathbf{p}\alpha'|V|\mathbf{p}''\alpha''\rangle\, d^3p''\, \langle\mathbf{p}''\alpha''|H'\rangle + \sum_{k''} \langle\mathbf{p}\alpha'|V|k''\rangle\langle k''|H'\rangle,$$

$$(E' - E_k)\langle k|H'\rangle = \sum_{\alpha''}\int \langle k|V|\mathbf{p}''\alpha''\rangle\, d^3p''\, \langle\mathbf{p}''\alpha''|H'\rangle + \sum_{k''} \langle k|V|k''\rangle\langle k''|H'\rangle. \tag{45}$$

Let us take one particular $E_k$ and consider the case when $E'$ is close
to it. The large term in the scattering coefficient (44) now arises from
those elements of the matrix representing $V$ that lie in row $k$ or in
column $k$, i.e. those of the type $\langle k|V|\mathbf{p}\alpha'\rangle$ or $\langle\mathbf{p}\alpha'|V|k\rangle$. The
scattering arising from the other matrix elements of $V$ is of a smaller order
of magnitude. This suggests that in our exact equations (45) we should
make the approximation of neglecting all the matrix elements of $V$
except the important ones, which are those of the type $\langle\mathbf{p}\alpha'|V|k\rangle$ or
$\langle k|V|\mathbf{p}\alpha'\rangle$, where $\alpha'$ is a state of the scatterer that has not too much
energy to be disallowed as a final state by the law of conservation of
energy. These equations then reduce to

$$(W' - W)\langle\mathbf{p}\alpha'|H'\rangle = \langle\mathbf{p}\alpha'|V|k\rangle\langle k|H'\rangle,$$

$$(E' - E_k)\langle k|H'\rangle = \sum_{\alpha'}\int \langle k|V|\mathbf{p}\alpha'\rangle\, d^3p\, \langle\mathbf{p}\alpha'|H'\rangle, \tag{46}$$

the $\alpha'$ summation being over those values of $\alpha'$ for which $W'$ given
by (18) is $> mc^2$. These equations are now sufficiently simple for us
to be able to solve exactly without further approximation.

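From the first of equations (46) we obtain by division

$$\langle\mathbf{p}\alpha'|H'\rangle = \langle\mathbf{p}\alpha'|V|k\rangle\langle k|H'\rangle/(W' - W) + \lambda\,\delta(W' - W). \tag{47}$$

We must choose $\lambda$, which may be any function of the momentum
$\mathbf{p}$ and $\alpha'$, such that (47) represents the incident particles corresponding
to $|0\rangle$ or $h^{3/2}|\mathbf{p}^0\alpha^0\rangle$ together with only outward moving particles. [The
representative of $h^{3/2}|\mathbf{p}^0\alpha^0\rangle$ is actually of the form $\lambda\,\delta(W' - W)$, since
the conditions $\alpha' = \alpha^0$ and $\mathbf{p} = \mathbf{p}^0$ for it not to vanish lead to
$W' = E' - H_s(\alpha') = E' - H_s(\alpha^0) = W^0 = W$.] Thus (47) must be

$$\langle\mathbf{p}\alpha'|H'\rangle = h^{3/2}\langle\mathbf{p}\alpha'|\mathbf{p}^0\alpha^0\rangle + \langle\mathbf{p}\alpha'|V|k\rangle\langle k|H'\rangle\{1/(W' - W) - i\pi\,\delta(W' - W)\}, \tag{48}$$

and from the general formula (37) the scattering coefficient will be

$$4\pi^2\,W^0W'P'/hc^4P^0 \,.\, |\langle\mathbf{p}'\alpha'|V|k\rangle|^2\,|\langle k|H'\rangle|^2. \tag{49}$$

It remains for us to determine the value of $\langle k|H'\rangle$. We can do this
by substituting for $\langle\mathbf{p}\alpha'|H'\rangle$ in the second of equations (46) its value
given by (48). This gives

$$(E' - E_k)\langle k|H'\rangle = h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle + \langle k|H'\rangle \sum_{\alpha'}\int |\langle k|V|\mathbf{p}\alpha'\rangle|^2\{1/(W' - W) - i\pi\,\delta(W' - W)\}\, d^3p$$

$$= h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle + \langle k|H'\rangle(a - ib),$$

where

$$a = \sum_{\alpha'}\int |\langle k|V|\mathbf{p}\alpha'\rangle|^2\, d^3p/(W' - W) \tag{50}$$

and

$$b = \pi\sum_{\alpha'}\int |\langle k|V|\mathbf{p}\alpha'\rangle|^2\,\delta(W' - W)\, d^3p$$

$$= \pi\sum_{\alpha'}\iiint |\langle k|V|P\omega\chi\alpha'\rangle|^2\,\delta(W' - W)\,P^2\, dP\,\sin\omega\, d\omega\, d\chi$$

$$= \pi\sum_{\alpha'} P'W'c^{-2}\iint |\langle k|V|P'\omega\chi\alpha'\rangle|^2\sin\omega\, d\omega\, d\chi. \tag{51}$$

Thus

$$\langle k|H'\rangle = h^{3/2}\langle k|V|\mathbf{p}^0\alpha^0\rangle/(E' - E_k - a + ib). \tag{52}$$

Note that $a$ and $b$ are real and that $b$ is positive.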

This value for $\langle k|H'\rangle$ substituted in (49) gives for the scattering
coefficient

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\;\frac{|\langle\mathbf{p}'\alpha'|V|k\rangle|^2\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2}{(E' - E_k - a)^2 + b^2}. \tag{53}$$

One can obtain the total effective area that the incident particle
must hit in order to be scattered anywhere by integrating (53) over
all directions of scattering, i.e. by integrating over all directions of
the vector $\mathbf{p}'$ with its magnitude kept fixed at $P'$, and then summing
over all $\alpha'$ that are to be taken into consideration, i.e. for which
$W' > mc^2$. This gives, with the help of (51), the result

$$\frac{4\pi h^2W^0}{c^2P^0}\;\frac{b\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2}{(E' - E_k - a)^2 + b^2}. \tag{54}$$

If we suppose $E'$ to vary continuously through the value $E_k$, the
main variation of (53) or (54) will be due to the small denominator
$(E' - E_k - a)^2 + b^2$. If we neglect the dependence of the other factors
in (53) and (54) on $E'$, then the maximum scattering will occur when
$E'$ has the value $E_k + a$ and the scattering will be half its maximum
when $E'$ differs from this value by an amount $b$. The large amount of
scattering that occurs for values of the energy of the incident particle
that make $E'$ nearly equal to $E_k$ gives rise to the phenomenon of an
absorption line. The centre of the line is displaced by an amount
$a$ from the resonance energy of the incident particle, i.e. the energy
which would make the total energy just $E_k$, while the quantity $b$ is
what is sometimes called the half-width of the line.
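
The statements about the position and half-width of the line can be checked with a few lines of Python. The sketch keeps only the resonance denominator of (53) and (54), treating all other factors as constant as the text does; the values of $E_k$, $a$ and $b$ are arbitrary assumptions, and the function name `line_shape` is ours.

```python
def line_shape(total_energy, e_k, a, b):
    """Resonance factor 1 / ((E' - E_k - a)^2 + b^2) from (53) and (54)."""
    return 1.0 / ((total_energy - e_k - a) ** 2 + b ** 2)

E_K, A, B = 10.0, 0.05, 0.2          # arbitrary illustrative values

peak = line_shape(E_K + A, E_K, A, B)            # value at the line centre
upper = line_shape(E_K + A + B, E_K, A, B)       # one half-width above
lower = line_shape(E_K + A - B, E_K, A, B)       # one half-width below
unshifted = line_shape(E_K, E_K, A, B)           # value at E' = E_k

print(peak, upper / peak, lower / peak, unshifted / peak)
```

The ratios at $E_k + a \pm b$ come out as one half, and the value at $E' = E_k$ is below the peak, showing the displacement of the centre of the line by $a$.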

53. Emission and absorption 

For studying emission and absorption we must consider non-
stationary states of the system and must use the perturbation method
of § 44. To determine the coefficient of spontaneous emission we must
take an initial state for which the particle is absorbed, corresponding
to a ket $|k\rangle$, and determine the probability that at some later
time the particle shall be on its way to infinity with a definite momentum.
The method of § 46 can now be applied. From the result (39) of that
section we see that the probability per unit time per unit range of
$\omega$ and $\chi$ of the particle being emitted in any direction
$\omega'$, $\chi'$ with the scatterer being left in state $\alpha'$ is

$$2\pi\hbar^{-1}\,|\langle W'\omega'\chi'\alpha'|V|k\rangle|^2, \tag{55}$$

provided, of course, that $\alpha'$ is such that the energy $W'$, given
by (18), of the particle is greater than $mc^2$. For values of $\alpha'$
that do not satisfy this condition there is no emission possible. The
matrix element $\langle W'\omega'\chi'\alpha'|V|k\rangle$ here must refer
to a representation in which $W$, $\omega$, $\chi$, and $\alpha$ are
diagonal with the weight function unity. The matrix elements of $V$
appearing in the three preceding sections refer to a representation in
which $p_x$, $p_y$, $p_z$ are diagonal with the weight function unity, or
$P$, $\omega$, $\chi$ are diagonal with the weight function $P^2\sin\omega$.
They would thus refer to a representation in which $W$, $\omega$, $\chi$
are diagonal with the weight function
$dP/dW \cdot P^2\sin\omega = WP/c^2 \cdot \sin\omega$.
Thus the matrix element $\langle W'\omega'\chi'\alpha'|V|k\rangle$ in (55)
is equal to $(W'P'/c^2 \cdot \sin\omega')^{1/2}$ times our previous matrix
element $\langle W'\omega'\chi'\alpha'|V|k\rangle$ or
$\langle\mathbf{p}'\alpha'|V|k\rangle$, so that (55) is equal to

$$\frac{2\pi}{\hbar}\,\frac{W'P'}{c^2}\,\sin\omega'\,|\langle\mathbf{p}'\alpha'|V|k\rangle|^2.$$

The probability of emission per unit solid angle per unit time, with
the scatterer simultaneously dropping to state $\alpha'$, is thus

$$\frac{2\pi}{\hbar}\,\frac{W'P'}{c^2}\,|\langle\mathbf{p}'\alpha'|V|k\rangle|^2. \tag{56}$$

To obtain the total probability per unit time of the particle being
emitted in any direction, with any final state for the scatterer, we
must integrate (56) over all angles and sum over all states $\alpha'$
whose energy $H_s(\alpha')$ is such that $H_s(\alpha') + mc^2 < E_k$. The
result is just $2b/\hbar$, where $b$ is defined by (51). There is thus this
simple relation between the total emission coefficient and the
half-width $b$ of the absorption line.

Let us now consider absorption. This requires that we shall take
an initial state for which the particle is certainly not absorbed but is
incident with a definite momentum. Thus the ket corresponding to
the initial state must be of the form (19). We must now determine
the probability of the particle being absorbed after time $t$. Since our
final state $k$ is not one of a continuous range, we cannot use directly
the result (39) of § 46. If, however, we take

$$|0\rangle = |\mathbf{p}^0\alpha^0\rangle \tag{57}$$

as the ket corresponding to the initial state, the analysis of §§ 44 and 46
is still applicable as far as equation (36) and shows us that the
probability of the particle being absorbed into state $k$ after time $t$ is

$$2\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2\,\frac{1-\cos\{(E_k-E')t/\hbar\}}{(E_k-E')^2}.$$

This corresponds to a distribution of incident particles of density
$h^{-3}$, owing to the omission of the factor $h^{3/2}$ from (57), as
compared with (19). The probability of there being an absorption after
time $t$ when there is one incident particle crossing unit area per unit
time is therefore

$$\frac{2h^3 W^0}{c^2 P^0}\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2\,\frac{1-\cos\{(E_k-E')t/\hbar\}}{(E_k-E')^2}. \tag{58}$$

To obtain the absorption coefficient we must consider the incident
particles not all to have exactly the same energy $W^0 = E'-H_s(\alpha^0)$,
but to have a distribution of energy values about the correct value
$E_k-H_s(\alpha^0)$ required for absorption. If we take a beam of incident
particles consisting of one crossing unit area per unit time per unit
energy range, the probability of there being an absorption after time
$t$ will be given by the integral of (58) with respect to $E'$. This integral
may be evaluated in the same way as (37) of § 46 and is equal to

$$\frac{4\pi^2 h^2 W^0 t}{c^2 P^0}\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2.$$

The probability per unit time of an absorption taking place with an
incident beam of one particle per unit area per unit time per unit
energy range is therefore

$$\frac{4\pi^2 h^2 W^0}{c^2 P^0}\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2, \tag{59}$$

which is the absorption coefficient.
The connexion between the absorption and emission coefficients
(59) and (56) and the resonance scattering coefficients calculated in
the preceding section should be noted. When the incident beam does
not consist of particles all with the same energy, but consists of a unit
distribution of particles per unit energy range crossing unit area per
unit time, the total number of incident particles with energies near
an absorption line that get scattered will be given by the integral
of (54) with respect to $E'$. If one neglects the dependence of the
numerator of (54) on $E'$, this integral will, since

$$\int_{-\infty}^{\infty}\frac{b}{(E'-E_k-a)^2+b^2}\,dE' = \pi,$$

have just the value (59). Thus the total number of scattered particles
in the neighbourhood of an absorption line is equal to the total number
absorbed. We can therefore regard all these scattered particles as
absorbed particles that are subsequently re-emitted in a different
direction. Further, the number of particles in the neighbourhood of
the absorption line that get scattered per unit solid angle about a
given direction specified by $\mathbf{p}'$ and then belong to scatterers
in state $\alpha'$ will be given by the integral with respect to $E'$ of
(53), which integral has in the same way the value

$$\frac{4\pi^3 h^2 W^0 W' P'}{b\,c^4 P^0}\,|\langle\mathbf{p}'\alpha'|V|k\rangle|^2\,|\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2.$$

This is just equal to the absorption coefficient (59) multiplied by the
emission coefficient (56) divided by $2b/\hbar$, the total emission
coefficient. This is in agreement with the point of view of regarding the
resonance scattered particles as those that are absorbed and then
re-emitted, the absorption and emission processes being governed
independently each by its own probability law, since this point of view
would make the fraction of the total number of absorbed particles that
are re-emitted in a unit solid angle about a given direction just the
emission coefficient for this direction divided by the total emission
coefficient.
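
The integral used above to pass from (54) to (59) can be checked numerically. The sketch below is an illustration under stated assumptions: the function name `lorentzian_integral`, the sample values of $E_k$, $a$, $b$ and the finite integration range are all hypothetical choices, and a plain midpoint rule stands in for the exact integral over all $E'$.

```python
import math


def lorentzian_integral(E_k, a, b, half_range, steps):
    """Midpoint-rule estimate of the integral of b / ((E' - E_k - a)^2 + b^2)
    over E' from E_k + a - half_range to E_k + a + half_range."""
    width = 2.0 * half_range / steps
    start = E_k + a - half_range
    total = 0.0
    for i in range(steps):
        E = start + (i + 0.5) * width
        total += b / ((E - E_k - a) ** 2 + b ** 2)
    return total * width


# Hypothetical values in arbitrary energy units (assumptions of this sketch).
E_k, a, b = 10.0, 0.3, 0.05
estimate = lorentzian_integral(E_k, a, b, half_range=2000 * b, steps=400_000)

# The finite range cuts off the slowly decaying tails, so the estimate falls
# slightly short of pi; the shortfall is roughly 2 * b / half_range.
print(estimate, math.pi)
```

Multiplying the result by the energy-independent factor $4\pi h^2 W^0 |\langle k|V|\mathbf{p}^0\alpha^0\rangle|^2 / c^2P^0$ of (54) then reproduces the absorption coefficient (59), apart from the small truncation error of the finite range.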
54. Symmetrical and antisymmetrical states

If a system in atomic physics contains a number of particles of the
same kind, e.g. a number of electrons, the particles are absolutely
indistinguishable one from another. No observable change is made
when two of them are interchanged. This circumstance gives rise to
some curious phenomena in quantum mechanics having no analogue
in the classical theory, which arise from the fact that in quantum
mechanics a transition may occur resulting in merely the interchange
of two similar particles, which transition then could not be detected
by any observational means. A satisfactory theory ought, of course,
to count two observationally indistinguishable states as the same
state and to deny that any transition does occur when two similar
particles exchange places. We shall find that it is possible to
reformulate the theory so that this is so.

Suppose we have a system containing $n$ similar particles. We may
take as our dynamical variables a set of variables $\xi_1$ describing the
first particle, the corresponding set $\xi_2$ describing the second
particle, and so on up to the set $\xi_n$ describing the $n$th particle. We
shall then have the $\xi_r$'s commuting with the $\xi_s$'s for $r \neq s$.
(We may require certain extra variables, describing what the system
consists of in addition to the $n$ similar particles, but it is not
necessary to mention these explicitly in the present chapter.) The
Hamiltonian describing the motion of the system will now be expressible
as a function of the $\xi_1, \xi_2, \ldots, \xi_n$. The fact that the
particles are similar requires that the Hamiltonian shall be a
symmetrical function of the $\xi_1, \xi_2, \ldots, \xi_n$, i.e. it shall
remain unchanged when the sets of variables $\xi_r$ are interchanged
or permuted in any way. This condition must hold, no matter what
perturbations are applied to the system. In fact, any quantity of
physical significance must be a symmetrical function of the $\xi$'s.

Let $|a_1\rangle$, $|b_1\rangle$,... be kets for the first particle
considered as a dynamical system by itself. There will be corresponding
kets $|a_2\rangle$, $|b_2\rangle$,... for the second particle by itself,
and so on. We can get a ket for the assembly by taking the product of
kets for each particle by itself, for example

$$|a_1\rangle|b_2\rangle|c_3\rangle\cdots|g_n\rangle = |a_1 b_2 c_3 \ldots g_n\rangle \tag{1}$$

say, according to the notation of (65) of § 20. The ket (1) corresponds
to a special kind of state for the assembly, which may be described
by saying that each particle is in its own state, corresponding to its
own factor on the left-hand side of (1). The general ket for the
assembly is of the form of a sum or integral of kets like (1), and
corresponds to a state for the assembly for which one cannot say that
each particle is in its own state, but only that each particle is partly
in several states, in a way which is correlated with the other particles
being partly in several states. If the kets $|a_1\rangle$, $|b_1\rangle$,...
are a set of basic kets for the first particle by itself, the kets
$|a_2\rangle$, $|b_2\rangle$,... will be a set of basic kets for the second
particle by itself, and so on, and the kets (1) will be a set of basic
kets for the assembly. We call the representation provided by such basic
kets for the assembly a symmetrical representation, as it treats all the
particles on the same footing.

In (1) we may interchange the kets for the first two particles and
get another ket for the assembly, namely

$$|b_1\rangle|a_2\rangle|c_3\rangle\cdots|g_n\rangle = |b_1 a_2 c_3 \ldots g_n\rangle.$$

More generally, we may interchange the role of the first two particles
in any ket for the assembly and get another ket for the assembly.
The process of interchanging the first two particles is an operator
which can be applied to kets for the assembly, and is evidently a
linear operator, of the type dealt with in § 7. Similarly, the process
of interchanging any pair of particles is a linear operator, and by
repeated applications of such interchanges we get any permutation
of the particles appearing as a linear operator which can be applied
to kets for the assembly. A permutation is called an even permutation
or an odd permutation according to whether it can be built up from
an even or an odd number of interchanges.
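
The definition of even and odd permutations can be made concrete with a short Python sketch. It is an illustration only: permutations are represented, as an assumption of this sketch, by tuples of the numbers 0, 1,..., n-1, and the helper names `count_interchanges` and `is_even` are hypothetical.

```python
from itertools import permutations


def count_interchanges(perm):
    """Undo perm by pairwise interchanges and return how many were needed.

    perm is a tuple of 0..n-1; perm[i] is the number standing in place i.
    """
    work = list(perm)
    swaps = 0
    for i in range(len(work)):
        while work[i] != i:
            j = work[i]
            work[i], work[j] = work[j], work[i]  # one interchange
            swaps += 1
    return swaps


def is_even(perm):
    """True if perm can be built up from an even number of interchanges."""
    return count_interchanges(perm) % 2 == 0


# Half of the permutations of four objects are even, half are odd.
even_count = sum(is_even(p) for p in permutations(range(4)))
print(even_count)
```

Different ways of building up a given permutation may use different numbers of interchanges, but that number is always even or always odd, so the count found by this particular procedure suffices to fix the parity.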

A ket for the assembly $|X\rangle$ is called symmetrical if it is
unchanged by any permutation, i.e. if

$$P|X\rangle = |X\rangle \tag{2}$$

for any permutation $P$. It is called antisymmetrical if it is unchanged
by any even permutation and has its sign changed by any odd
permutation, i.e. if

$$P|X\rangle = \pm|X\rangle, \tag{3}$$

the $+$ or $-$ sign being taken according to whether $P$ is even or odd.
The state corresponding to a symmetrical ket is called a symmetrical
state, and the state corresponding to an antisymmetrical ket is called
an antisymmetrical state. In a symmetrical representation, the
representative of a symmetrical ket is a symmetrical function of the
variables referring to the various particles and the representative of
an antisymmetrical ket is an antisymmetrical function.

In the Schrodinger picture, the ket corresponding to a state of the 
assembly will vary with time according to Schrodinger’s equation of 
motion. If it is initially symmetrical it must always remain sym- 
metrical, since, owing to the Hamiltonian being symmetrical, there 
is nothing to disturb the symmetry. Similarly if the ket is initially 
antisymmetrical it must always remain antisymmetrical. Thus a 
state which is initially symmetrical always remains symmetrical and
a state which is initially antisymmetrical always remains
antisymmetrical. In consequence, it may be that for a particular kind of
particle only symmetrical states occur in nature, or only
antisymmetrical states occur in nature. If either of these possibilities
held, it would lead to certain special phenomena for the particles in 
question. 

Let us suppose first that only antisymmetrical states occur in
nature. The ket (1) is not antisymmetrical and so does not correspond
to a state occurring in nature. From (1) we can in general form
an antisymmetrical ket by applying all possible permutations to it
and adding the results, with the coefficient $-1$ inserted before those
terms arising from an odd permutation, so as to get

$$\sum_P \pm P\,|a_1 b_2 c_3 \ldots g_n\rangle, \tag{4}$$

the $+$ or $-$ sign being taken according to whether $P$ is even or odd.
The ket (4) may be written as a determinant

$$\begin{vmatrix}
|a_1\rangle & |a_2\rangle & \cdots & |a_n\rangle \\
|b_1\rangle & |b_2\rangle & \cdots & |b_n\rangle \\
\vdots & \vdots & & \vdots \\
|g_1\rangle & |g_2\rangle & \cdots & |g_n\rangle
\end{vmatrix} \tag{5}$$

and its representative in a symmetrical representation is a determinant.
The ket (4) or (5) is not the general antisymmetrical ket, but
is a specially simple one. It corresponds to a state for the assembly
for which one can say that certain particle-states, namely the states
$a, b, c, \ldots, g$, are occupied, but one cannot say which particle is in
which state, each particle being equally likely to be in any state. If
two of the particle-states $a, b, c, \ldots, g$ are the same, the ket (4) or
(5) vanishes and does not correspond to any state for the assembly.
Thus two particles cannot occupy the same state. More generally, the
occupied states must be all independent, otherwise (4) or (5) vanishes.
This is an important characteristic of particles for which only
antisymmetrical states occur in nature. It leads to a special statistics,
which was first studied by Fermi, so we shall call particles for which
only antisymmetrical states occur in nature fermions.
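
The vanishing of the antisymmetrical ket (4) when the occupied states are not independent can be illustrated numerically. The following is a minimal sketch, assuming that each one-particle ket is represented by a real vector and that the product ket (1) is represented by the outer product of these vectors; the function names are hypothetical and NumPy is used only for the array arithmetic.

```python
from itertools import permutations

import numpy as np


def sign(perm):
    """+1 for an even permutation, -1 for an odd one (via the inversion count)."""
    n = len(perm)
    inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
    return -1 if inversions % 2 else 1


def product_ket(states):
    """Representative of |a_1>|b_2>...|g_n>: outer product of one-particle vectors."""
    ket = np.asarray(states[0], dtype=float)
    for state in states[1:]:
        ket = np.multiply.outer(ket, np.asarray(state, dtype=float))
    return ket


def antisymmetrize(states):
    """Sum over all permutations P of (+/-) P applied to the product ket, as in (4)."""
    ket = product_ket(states)
    n = len(states)
    # Permuting the particles is represented by permuting the tensor axes.
    return sum(sign(p) * np.transpose(ket, p) for p in permutations(range(n)))


a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
c = np.array([0.0, 0.0, 1.0])

independent = antisymmetrize([a, b, c])      # three independent states: non-zero
repeated = antisymmetrize([a, a, c])         # two equal states: vanishes
dependent = antisymmetrize([a, b, a + 2 * b])  # dependent states: vanishes

print(np.abs(independent).max(), np.abs(repeated).max(), np.abs(dependent).max())
```

Interchanging the first two particles corresponds here to swapping the first two axes of the array, which changes the sign of `independent`, as required of an antisymmetrical ket.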

Let us suppose now that only symmetrical states occur in nature.
The ket (1) is not symmetrical, except in the special case when all the
particle-states $a, b, c, \ldots, g$ are the same, but we can always obtain
a symmetrical ket from it by applying all possible permutations to it
and adding the results, so as to get

$$\sum_P P\,|a_1 b_2 c_3 \ldots g_n\rangle. \tag{6}$$

The ket (6) is not the general symmetrical ket, but is a specially
simple one. It corresponds to a state for the assembly for which one
can say that certain particle-states are occupied, namely the states
$a, b, c, \ldots, g$, without being able to say which particle is in which
state. It is now possible for two or more of the states $a, b, c, \ldots, g$
to be the same, so that two or more particles can be in the same state.
In spite of this, the statistics of the particles is not the same as the
usual statistics of the classical theory. The new statistics was first
studied by Bose, so we shall call particles for which only symmetrical
states occur in nature bosons.

We can see the difference of Bose statistics from the usual statistics
by considering a special case, that of only two particles and only two
independent states $a$ and $b$ for a particle. According to classical
mechanics, if the assembly of two particles is in thermodynamic
equilibrium at a high temperature, each particle will be equally likely
to be in either state. There is thus a probability 1/4 of both particles
being in state $a$, a probability 1/4 of both particles being in state $b$,
and a probability 1/2 of one particle being in each state. In the
quantum theory there are three independent symmetrical states for the
pair of particles, corresponding to the symmetrical kets
$|a_1\rangle|a_2\rangle$, $|b_1\rangle|b_2\rangle$, and
$|a_1\rangle|b_2\rangle + |a_2\rangle|b_1\rangle$, and describable as both
particles in state $a$, both particles in state $b$, and one particle in
each state respectively. For thermodynamic equilibrium at a high
temperature these three states are equally probable, as was shown in
§ 33, so that there is a probability 1/3 of both particles being in state
$a$, a probability 1/3 of both particles being in state $b$, and a
probability 1/3 of one particle being in each state. Thus with Bose
statistics the probability of two particles being in the same state is
greater than with classical statistics. Bose statistics differ from
classical statistics in the opposite direction to Fermi statistics, for
which the probability of two particles being in the same state is zero.
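
The counting in this two-particle, two-state example can be reproduced by direct enumeration. The sketch below is illustrative only and assumes, as in the text, that at high temperature every allowed configuration is equally probable; the variable and function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, product

states = ("a", "b")

# Classical statistics: the particles are distinguishable, giving 4 configurations.
classical = list(product(states, repeat=2))
# Bose statistics: one symmetrical state per unordered choice of two states.
bose = list(combinations_with_replacement(states, 2))
# Fermi statistics: the two occupied states must be different.
fermi = list(combinations(states, 2))


def same_state_probability(configurations):
    """Probability that both particles are in the same state, all configurations equally likely."""
    same = sum(1 for first, second in configurations if first == second)
    return Fraction(same, len(configurations))


print(same_state_probability(classical))  # 1/2
print(same_state_probability(bose))       # 2/3
print(same_state_probability(fermi))      # 0
```

The classical value 1/2 is the sum 1/4 + 1/4, and the Bose value 2/3 is the sum 1/3 + 1/3, so the Bose and Fermi results lie on opposite sides of the classical one.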

In building up a theory of atoms on the lines mentioned at the 
beginning of § 38, to get agreement with experiment one must assume 
that two electrons are never in the same state. This rule is known as 
Pauli’s exclusion principle. It shows us that electrons are fermions. 
Planck’s law of radiation shows us that photons are bosons, as only the
Bose statistics for photons will lead to Planck’s law. Similarly, for
each of the other kinds of particle known in physics, there is
experimental evidence to show either that they are fermions, or that they
are bosons. Protons, neutrons, positrons are fermions, $\alpha$-particles
are bosons. It appears that all particles occurring in nature are either
fermions or bosons, and thus only antisymmetrical or symmetrical
states for an assembly of similar particles are met with in practice.
Other more complicated kinds of symmetry are possible mathemati- 
cally, but do not apply to any known particles. With a theory which 
allows only antisymmetrical or only symmetrical states for a particu- 
lar kind of particle, one cannot make a distinction between two states 
which differ only through a permutation of the particles, so that the 
transitions mentioned at the beginning of this section disappear. 

55. Permutations as dynamical variables 

We shall now build up a general theory for a system containing $n$
similar particles when states with any kind of symmetry properties
are allowed, i.e. when there is no restriction to only symmetrical or
only antisymmetrical states. The general state now will not be
symmetrical or antisymmetrical, nor will it be expressible linearly in
terms of symmetrical and antisymmetrical states when $n > 2$. This
theory will not apply directly to any particles occurring in nature,
but all the same it is useful for setting up an approximate treatment
for an assembly of electrons, as will be shown in § 58.

We have seen that each permutation $P$ of the $n$ particles is a linear
operator which can be applied to any ket for the assembly. Hence
we can regard $P$ as a dynamical variable in our system of $n$ particles.
There are $n!$ permutations, each of which can be regarded as a
dynamical variable. One of them, $P_1$ say, is the identical permutation,
which is equal to unity. The product of any two permutations is a
third permutation and hence any function of the permutations is
reducible to a linear function of them. Any permutation $P$ has a
reciprocal $P^{-1}$ satisfying

$$PP^{-1} = P^{-1}P = P_1 = 1.$$

A permutation $P$ can be applied to a bra $\langle X|$ for the assembly,
to give another bra, which we shall denote for the present by
$P\langle X|$. If $P$ is applied to both factors of the product
$\langle X|Y\rangle$, the product must be unchanged, since it is just a
number, independent of any order of the particles. Thus

$$(P\langle X|)\,P|Y\rangle = \langle X|Y\rangle,$$

showing that

$$P\langle X| = \langle X|P^{-1}. \tag{7}$$

Now $P\langle X|$ is the conjugate imaginary of $P|X\rangle$ and is thus
equal to $\langle X|\bar{P}$, and hence from (7)

$$\bar{P} = P^{-1}. \tag{8}$$

Thus a permutation is not in general a real dynamical variable, its
conjugate complex being equal to its reciprocal.

Any permutation of the numbers 1, 2, 3,..., $n$ may be expressed in
the cyclic notation, e.g. with $n = 8$

$$P_a = (143)(27)(58)(6), \tag{9}$$

in which each number is to be replaced by the succeeding number in
a bracket, unless it is the last in a bracket, when it is to be replaced
by the first in that bracket. Thus $P_a$ changes the numbers 12345678
into 47138625. The type of any permutation is specified by the
partition of the number $n$ which is provided by the number of numbers
in each of the brackets. Thus the type of $P_a$ is specified by the
partition 8 = 3+2+2+1. Permutations of the same type, i.e.
corresponding to the same partition, we shall call similar. Thus, for
example, $P_a$ in (9) is similar to

$$P_b = (871)(35)(46)(2). \tag{10}$$

The whole of the $n!$ possible permutations may be divided into sets
of similar permutations, each such set being called a class. The
permutation $P_1 = 1$ forms a class by itself. Any permutation is similar
to its reciprocal.
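
The cyclic notation and the notion of type can be checked with a few lines of Python. This is an illustrative sketch only: the helper names `from_cycles`, `cycle_type` and `inverse`, and the choice of a dictionary to hold a permutation, are assumptions of the sketch and not part of the text.

```python
def from_cycles(cycles, n):
    """Permutation of 1..n from cyclic notation, as a dict number -> replacement."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for position, number in enumerate(cycle):
            # Each number is replaced by the next in its bracket, the last by the first.
            perm[number] = cycle[(position + 1) % len(cycle)]
    return perm


def cycle_type(perm):
    """Lengths of the brackets, largest first: the partition of n fixing the type."""
    seen, lengths = set(), []
    for start in sorted(perm):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        if length:
            lengths.append(length)
    return sorted(lengths, reverse=True)


def inverse(perm):
    """Reciprocal permutation."""
    return {image: number for number, image in perm.items()}


P_a = from_cycles([(1, 4, 3), (2, 7), (5, 8), (6,)], 8)
P_b = from_cycles([(8, 7, 1), (3, 5), (4, 6), (2,)], 8)

image = "".join(str(P_a[i]) for i in range(1, 9))
print(image)            # 47138625
print(cycle_type(P_a))  # [3, 2, 2, 1]
```

Since `cycle_type(P_a)` and `cycle_type(P_b)` agree, the two permutations are similar, and the same comparison applied to `inverse(P_a)` illustrates that a permutation is similar to its reciprocal.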
When two permutations $P_a$ and $P_b$ are similar, either of them $P_b$
may be obtained by making a certain permutation $P_x$ in the other
$P_a$. Thus, in our example (9), (10) we can take $P_x$ to be the
permutation that changes 14327586 into 87135462, i.e. the permutation

$$P_x = (18623)(475).$$

Different ways of writing $P_a$ and $P_b$ in the cyclic notation would lead
to different $P_x$'s. Any of these $P_x$'s applied to the product
$P_a|X\rangle$ would change it into $P_b \cdot P_x|X\rangle$, i.e.

$$P_x P_a|X\rangle = P_b P_x|X\rangle.$$

Hence

$$P_b = P_x P_a P_x^{-1}, \tag{11}$$

which expresses the condition for $P_a$ and $P_b$ to be similar as an
algebraic equation. The existence of any $P_x$ satisfying (11) is
sufficient to show that $P_a$ and $P_b$ are similar.
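
Relation (11) can be verified for the example (9), (10). The sketch below is an illustration under an explicit assumption: a product of permutations is read as a composition of replacement rules in which the right-hand factor acts first. The helper names are hypothetical.

```python
def from_cycles(cycles, n):
    """Permutation of 1..n from cyclic notation, as a dict number -> replacement."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for position, number in enumerate(cycle):
            perm[number] = cycle[(position + 1) % len(cycle)]
    return perm


def compose(p, q):
    """Product p q, with q acting first (an assumed convention of this sketch)."""
    return {i: p[q[i]] for i in q}


def inverse(p):
    """Reciprocal permutation."""
    return {image: number for number, image in p.items()}


P_a = from_cycles([(1, 4, 3), (2, 7), (5, 8), (6,)], 8)
P_b = from_cycles([(8, 7, 1), (3, 5), (4, 6), (2,)], 8)
P_x = from_cycles([(1, 8, 6, 2, 3), (4, 7, 5)], 8)

# P_x changes 14327586 into 87135462, number by number.
changed = "".join(str(P_x[int(digit)]) for digit in "14327586")

# Right-hand side of (11).
conjugated = compose(P_x, compose(P_a, inverse(P_x)))

print(changed)            # 87135462
print(conjugated == P_b)  # True
```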

56. Permutations as constants of the motion 

Any symmetrical function $V$ of the dynamical variables of all the
particles is unchanged by the application of any permutation $P$, so
$P$ applied to the product $V|X\rangle$ affects only the factor
$|X\rangle$, thus

$$PV|X\rangle = VP|X\rangle.$$

Hence

$$PV = VP, \tag{12}$$

showing that a symmetrical function of the dynamical variables
commutes with every permutation. The Hamiltonian is a symmetrical
function of the dynamical variables and thus commutes with every
permutation. It follows that each permutation is a constant of the
motion. This holds even if the Hamiltonian is not constant. If
$|Xt\rangle$ is any solution of Schrödinger’s equation of motion,
$P|Xt\rangle$ is another.

In dealing with any system in quantum mechanics, when we have
found a constant of the motion $\alpha$, we know that if for any state of
motion $\alpha$ initially has the numerical value $\alpha'$, then it always
has this value, so that we can assign different numbers $\alpha'$ to the
different states and so obtain a classification of the states. The
procedure is not so straightforward, however, when we have several
constants of the motion $\alpha$ which do not commute (as is the case with
our permutations $P$), since we cannot in general assign numerical values
for all the $\alpha$'s simultaneously to any state. Let us first take the
case of a system whose Hamiltonian does not involve the time explicitly.
The existence of constants of the motion $\alpha$ which do not commute is
then a sign that the system is degenerate. This is because, for a
non-degenerate system, the Hamiltonian $H$ by itself forms a complete
set of commuting observables and hence, from Theorem 2 of § 19, each
of the $\alpha$'s is a function of $H$ and therefore commutes with any
other $\alpha$.

We must now look for a function $\beta$ of the $\alpha$'s which has one and
the same numerical value $\beta'$ for all those states belonging to one
energy-level $H'$, so that we can use $\beta$ for classifying the
energy-levels of the system. We can express the condition for $\beta$ by
saying that it must be a function of $H$ and must therefore commute with
every dynamical variable that commutes with $H$, i.e. with every constant
of the motion. If the $\alpha$'s are the only constants of the motion, or
if they are a set that commute with all other independent constants of
the motion, our problem reduces to finding a function $\beta$ of the
$\alpha$'s which commutes with all the $\alpha$'s. We can then assign a
numerical value $\beta'$ for $\beta$ to each energy-level of the system. If
we can find several such functions $\beta$, they must all commute with each
other, so that we can give them all numerical values simultaneously. We
obtain thus a classification of the energy-levels. When the Hamiltonian
involves the time explicitly one cannot talk about energy-levels, but
the $\beta$'s will still give a useful classification of the states.

We follow this method in dealing with our permutations $P$. We
must find a function $\chi$ of the $P$'s such that $P\chi P^{-1} = \chi$
for every $P$. It is evident that a possible $\chi$ is $\sum P_c$, the sum
of all the permutations in a certain class $c$, i.e. the sum of a set of
similar permutations, since $\sum PP_cP^{-1}$ must consist of the same
permutations summed in a different order. There will be one such $\chi$
for each class. Further, there can be no other independent $\chi$, since
an arbitrary function of the $P$'s can be expressed as a linear function
of them with numerical coefficients, and it will not then commute with
every $P$ unless the coefficients of similar $P$'s are always the same. We
thus obtain all the $\chi$'s that can be used for classifying the states.
It is convenient to define each $\chi$ as an average instead of a sum, thus

$$\chi_c = n_c^{-1} \sum P_c,$$

where $n_c$ is the number of $P$'s in the class $c$. An alternative
expression for $\chi_c$ is

$$\chi_c = n!^{-1} \sum_P P P_c P^{-1}, \tag{13}$$

the sum being extended over all the $n!$ permutations $P$, it being easy
to verify that this sum contains each member of the class $c$ the same
number of times. For each permutation $P$ there is one $\chi$, $\chi(P)$
say, equal to the average of all permutations similar to $P$. One of the
$\chi$'s is $\chi(P_1) = 1$.
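
Both properties of the class sums used above can be confirmed by enumeration for small $n$. The following sketch takes $n = 4$ as a hypothetical example; permutations are represented by tuples of 0,..., n-1 and products by composition with the right-hand factor acting first, which are assumptions of the sketch, as are the helper names.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import factorial


def compose(p, q):
    """Product p q of two permutations given as tuples, q acting first."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Reciprocal permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def cycle_type(p):
    """Partition of n giving the type of the permutation p."""
    seen, lengths = set(), []
    for start in range(len(p)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


n = 4
group = list(permutations(range(n)))
classes = defaultdict(list)
for p in group:
    classes[cycle_type(p)].append(p)

# P (sum of P_c) P^-1 contains the same permutations in a different order.
class_sums_invariant = all(
    sorted(compose(compose(P, Q), inverse(P)) for Q in members) == sorted(members)
    for P in group
    for members in classes.values()
)

# In (13) each member of the class c occurs the same number of times, n!/n_c.
multiplicities = {
    ctype: Counter(compose(compose(P, members[0]), inverse(P)) for P in group)
    for ctype, members in classes.items()
}

print(class_sums_invariant)
print({ctype: set(counts.values()) for ctype, counts in multiplicities.items()})
```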

The constants of the motion $\chi_1, \chi_2, \ldots, \chi_m$ obtained in
this way will each have a definite numerical value for every stationary
state of the system, in the case when the Hamiltonian does not involve the
time explicitly, and also in the general case can be used for classifying
the states, there being one set of states for every permissible set of
numerical values $\chi'_1, \chi'_2, \ldots, \chi'_m$ for the $\chi$'s.
Since the $\chi$'s are always constants of the motion, these sets of states
will be exclusive, i.e. transitions will never take place from a state in
one set to a state in another.

The permissible sets of values $\chi'$ that one can give to the $\chi$'s
are limited by the fact that there exist algebraic relations between the
$\chi$'s. The product of any two $\chi$'s, $\chi_p\chi_q$, is of course
expressible as a linear function of the $P$'s, and since it commutes with
every $P$ it must be expressible as a linear function of the $\chi$'s, thus

$$\chi_p\chi_q = a_1\chi_1 + a_2\chi_2 + \ldots + a_m\chi_m, \tag{14}$$

where the $a$'s are numbers. Any numerical values $\chi'$ that one gives
to the $\chi$'s must be eigenvalues of the $\chi$'s and must satisfy these
same algebraic equations. For every solution $\chi'$ of these equations
there is one exclusive set of states. One solution is evidently
$\chi'_p = 1$ for every $\chi_p$, giving the set of symmetrical states. A
second obvious solution, giving the set of antisymmetrical states, is
$\chi'_p = \pm 1$, the $+$ or $-$ sign being taken according to whether the
permutations in the class $p$ are even or odd. The other solutions may be
worked out in any special case by ordinary algebraic methods, as the
coefficients $a$ in (14) may be obtained directly by a consideration of the
types of permutation to which the $\chi$'s concerned refer. Any solution
is, apart from a certain factor, what is called in group theory a character
of the group of permutations. The $\chi$'s are all real dynamical variables,
since each $P$ and its conjugate complex $P^{-1}$ are similar and will occur
added together in the definition of any $\chi$, so that the $\chi'$'s must
be all real numbers.

The number of possible solutions of the equations (14) may easily
be determined, since it must equal the number of different eigenvalues
of an arbitrary function $B$ of the $\chi$'s. We can express $B$ as
a linear function of the $\chi$'s with the help of equations (14); thus

$$B = b_1\chi_1 + b_2\chi_2 + \ldots + b_m\chi_m. \tag{15}$$





Similarly, we can express each of the quantities $B^2, B^3, \ldots, B^m$
as a linear function of the $\chi$'s. From the $m$ equations thus obtained,
together with the equation $\chi(P_1) = 1$, we can eliminate the $m$
unknowns $\chi_1, \chi_2, \ldots, \chi_m$, obtaining as result an algebraic
equation of degree $m$ for $B$,

$$B^m + c_1B^{m-1} + c_2B^{m-2} + \ldots + c_m = 0.$$

The $m$ solutions of this equation give the $m$ possible eigenvalues
for $B$, each of which will, according to (15), be a linear function of
$b_1, b_2, \ldots, b_m$ whose coefficients are a permissible set of values
$\chi'_1, \chi'_2, \ldots, \chi'_m$. The sets of values $\chi'$ thus obtained
must be all different, since if there were fewer than $m$ different
permissible sets of values $\chi'$ for the $\chi$'s, there would exist a
linear function of the $\chi$'s every one of whose eigenvalues vanishes,
which would mean that the linear function itself vanishes and the $\chi$'s
are not linearly independent. Thus the number of permissible sets of
numerical values for the $\chi$'s is just equal to $m$, which is the number
of classes of permutations or the number of partitions of $n$. This number
is therefore the number of exclusive sets of states.
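
The counting argument can be illustrated numerically for small $n$. The sketch below makes several assumptions that are not in the text: the $\chi$'s are realized as matrices acting on the $n!$ permutations themselves (the regular representation of group theory), the coefficients $b$ in (15) are drawn at random with a fixed seed, and all function names are hypothetical. NumPy supplies the eigenvalues.

```python
from collections import defaultdict
from itertools import permutations

import numpy as np


def compose(p, q):
    """Product p q of two permutations given as tuples, q acting first."""
    return tuple(p[q[i]] for i in range(len(q)))


def cycle_type(p):
    """Partition of n giving the type of the permutation p."""
    seen, lengths = set(), []
    for start in range(len(p)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def class_averages(n):
    """Matrices of the chi's, acting on the n! permutations by left multiplication."""
    group = list(permutations(range(n)))
    index = {g: k for k, g in enumerate(group)}
    sums = defaultdict(lambda: np.zeros((len(group), len(group))))
    sizes = defaultdict(int)
    for P in group:
        ctype = cycle_type(P)
        for g in group:
            sums[ctype][index[compose(P, g)], index[g]] += 1.0
        sizes[ctype] += 1
    return {ctype: sums[ctype] / sizes[ctype] for ctype in sums}


def count_partitions(n, largest=None):
    """Number of partitions of n into parts no larger than `largest`."""
    largest = n if largest is None else largest
    if n == 0:
        return 1
    return sum(count_partitions(n - k, k) for k in range(1, min(n, largest) + 1))


def distinct_eigenvalue_count(n, seed=0):
    """Number of different eigenvalues of B = sum of b_c chi_c with random b's."""
    chis = list(class_averages(n).values())
    b = np.random.default_rng(seed).normal(size=len(chis))
    B = sum(coeff * chi for coeff, chi in zip(b, chis))
    eigenvalues = np.linalg.eigvalsh(B)  # B is symmetric; values come back sorted
    return 1 + int(np.sum(np.diff(eigenvalues) > 1e-8))


for n in (2, 3, 4):
    print(n, len(class_averages(n)), count_partitions(n), distinct_eigenvalue_count(n))
```

For each $n$ tried, the number of classes, the number of partitions of $n$ and the number of different eigenvalues of $B$ agree, in line with the conclusion that there are just $m$ permissible sets of values for the $\chi$'s.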

All dynamical variables of physical importance and all observable
quantities are symmetrical between the particles and thus commute
with all the $P$'s. Thus the only functions of the $P$'s of physical
importance are the $\chi$'s. The states corresponding to $|\chi'\rangle$
and to $f(P)|\chi'\rangle$, where $|\chi'\rangle$ is any eigenket of the
$\chi$'s belonging to the eigenvalues $\chi'$ and $f(P)$ is any function of
the $P$'s such that $f(P)|\chi'\rangle \neq 0$, are observationally
indistinguishable and are thus physically equivalent. There is a definite
number, $n(\chi')$ say, of independent kets which can be formed by
multiplying $|\chi'\rangle$ by functions of the $P$'s, which number depends
only on the $\chi'$'s. It is the number of rows and columns in a matrix
representation of the $P$'s in which each $\chi$ is equal to $\chi'$. If
$|\chi'\rangle$ corresponds to a stationary state, $n(\chi')$ will be its
degree of degeneracy (so far as concerns degeneracy caused by the
symmetry between the particles). This degeneracy cannot be removed
by any perturbation that is symmetrical between the particles.

57. Determination of the energy-levels 

Let us apply the perturbation method of § 43 and make a first-order
calculation of the energy-levels in the case when the Hamiltonian
does not involve the time explicitly. We suppose that for our
unperturbed stationary states of the assembly each of the similar particles
has its own individual state. With $n$ particles, we shall have $n$ of
these states, corresponding to kets $|\alpha^1\rangle$, $|\alpha^2\rangle$,...,
$|\alpha^n\rangle$ say, which we assume for the present to be all
orthogonal. The ket for the assembly is

$$|X\rangle = |\alpha^1_1\rangle|\alpha^2_2\rangle\cdots|\alpha^n_n\rangle, \tag{16}$$

like (1) with $\alpha^1, \alpha^2, \ldots$ instead of $a, b, \ldots$. If we
apply any permutation $P$ to it we get another ket

$$P|X\rangle = |\alpha^1_r\rangle|\alpha^2_s\rangle\cdots|\alpha^n_z\rangle \tag{17}$$

say, $r, s, \ldots, z$ being some permutation of the numbers 1, 2,..., $n$,
corresponding to another stationary state of the assembly with the
same energy. There are thus altogether $n!$ unperturbed states with
this energy, if we assume there are no other causes of degeneracy.
According to the method of § 43 when the unperturbed system is
degenerate, we must consider those elements of the matrix representing
the perturbing energy $V$ that refer to two states with the same
energy, i.e. those of the type $\langle X|P_a V P_b|X\rangle$. These will
form a matrix with $n!$ rows and columns, whose eigenvalues are the
first-order corrections in the energy-levels.

We must now introduce another kind of permutation operator
which can be applied to kets of the form (17), namely a permutation
which acts on the indices of the $\alpha$'s. We denote such a permutation
operator by $P^\alpha$. The essential difference between the $P$'s and the
$P^\alpha$'s may be seen in the following way. Let us consider a permutation
in the general sense, say that consisting of the interchange of 2 and 3.
This may be interpreted either as the interchange of the objects 2 and
3 or as the interchange of the objects in the places 2 and 3, these two
operations producing in general quite different results. The first of
these interpretations is the one that gives the operators $P$, the objects
concerned being the similar particles. A permutation $P$ can be
applied to an arbitrary ket for the assembly. A permutation with the
second interpretation has a meaning, however, only when applied
to a ket of the form (17), for which each of the particles is in a 'place'
specified by an $\alpha$, or to a sum of kets of the form (17). A permutation
$P$ may be considered as an ordinary dynamical variable. A permutation
$P^\alpha$ may be considered a dynamical variable in a restricted
sense, valid when one is dealing only with states obtainable by
superposition of the various states (17). This is the case for our present
perturbation problem.
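
The distinction between interchanging objects and interchanging the
contents of places can be made concrete with a small sketch. The tuple
model and the function names below are illustrative assumptions, not part
of the text: a ket of the form (17) is represented by a tuple whose k-th
entry is the label of the particle occupying the place $\alpha^k$.

```python
# Toy model of a ket of the form (17): a tuple whose k-th entry is the label
# of the particle sitting in the place alpha^k (labels and places are 1-based).

def swap_particles(ket, i, j):
    """A permutation P: interchange the particles labelled i and j."""
    relabel = {i: j, j: i}
    return tuple(relabel.get(p, p) for p in ket)

def swap_places(ket, i, j):
    """A permutation P^alpha: interchange the contents of places i and j."""
    out = list(ket)
    out[i - 1], out[j - 1] = out[j - 1], out[i - 1]
    return tuple(out)

# Particle 3 in alpha^1, particle 1 in alpha^2, particle 2 in alpha^3.
ket = (3, 1, 2)

by_particle = swap_particles(ket, 2, 3)   # objects 2 and 3 interchanged
by_place = swap_places(ket, 2, 3)         # contents of places 2 and 3 interchanged

# A permutation of the particles commutes with a permutation of the places.
commute = (swap_places(swap_particles(ket, 2, 3), 1, 2)
           == swap_particles(swap_places(ket, 1, 2), 2, 3))
```

In this model the two interpretations of "the interchange of 2 and 3" give
different tuples, while a particle permutation followed by a place
permutation gives the same result in either order.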

We can form algebraic functions of the $P^\alpha$ which will be other
operators applicable to kets of the form (17). In particular we can
form $\chi(P^\alpha_c)$, the average of all $P^\alpha$'s in a certain class $c$. This must
equal the average of the permutation operators $P$ in the same
class, since the total set of all permutations in a given class must
evidently be the same whether the permutations are applied to the
particles or to the places the particles are in. Any $P$ commutes with
any $P^\alpha$, i.e.

$$P_a P^\alpha_b = P^\alpha_b P_a. \tag{18}$$


By labelling the $\alpha$'s by the same numbers $1, 2, 3, \ldots, n$ which label
the particles, we set up a one-one correspondence between the $\alpha$'s and
the particles, so that given any permutation $P_a$ applying to the
particles, we can give a meaning to the same permutation $P^\alpha_a$ applying
to the $\alpha$'s. This meaning is such that, for the ket $|X\rangle$ given by (16),

$$P^\alpha_a P_a|X\rangle = |X\rangle. \tag{19}$$

Since the various kets $|\alpha^1\rangle, |\alpha^2\rangle, \ldots$ are orthogonal, $|X\rangle$ and $P|X\rangle$ are
orthogonal unless $P = 1$. It follows that, for any coefficients $c_P$,

$$\sum_P c_P \langle X|P^\alpha P_a|X\rangle = c_{P_a}, \tag{20}$$

provided $|X\rangle$ is normalized, the summation being over all the $n!$
permutations $P$ or $P^\alpha$, with $P_a$ fixed. Now define $V_P$ by

$$V_P = \langle X|VP|X\rangle. \tag{21}$$

We then have, for any two permutations $P_x$ and $P_y$,

$$\langle X|P_x V P_y|X\rangle = \langle X|V P_x P_y|X\rangle = V_{P_x P_y} = \sum_P V_P \langle X|P^\alpha P_x P_y|X\rangle$$

with the help of (20). From (18) this gives

$$\langle X|P_x V P_y|X\rangle = \sum_P V_P \langle X|P_x P^\alpha P_y|X\rangle. \tag{22}$$

We may write this result as

$$V \approx \sum_P V_P P^\alpha, \tag{23}$$

where the sign $\approx$ means an equation in a restricted sense, the
operators on the two sides being equal so long as they are used only
with kets of the form $P|X\rangle$ and their conjugate imaginary bras.

The formula (23) shows that the perturbing energy $V$ is equal, in
the restricted sense, to a linear function of the permutation operators
$P^\alpha$ with coefficients $V_P$ given by (21). The restricted sense is adequate
for the calculation of the first-order correction in the energy-levels,
as this calculation involves only those matrix elements of $V$ given by
(22). The formula (23) is a very convenient one because the expression
on its right-hand side is easily handled.

As an example of an application of (23) we shall determine the
average energy of all those states, arising from the unperturbed state
(16), that belong to one exclusive set. This requires us to calculate
the average eigenvalue of $V$ for those states (17) for which the $\chi$'s
have specified numerical values $\chi'$. Now the average eigenvalue of
$P^\alpha_a$ for any of these states equals that of $P^\alpha P^\alpha_a (P^\alpha)^{-1}$ for arbitrary
$P^\alpha$ and thus equals that of $n!^{-1} \sum_{P^\alpha} P^\alpha P^\alpha_a (P^\alpha)^{-1}$, which is $\chi'(P^\alpha_a)$ or
$\chi'(P_a)$. Hence the average eigenvalue of $V$ is $\sum_P V_P\, \chi'(P)$. A similar
method could be used for calculating the average eigenvalue of any
function of $V$, it being necessary only to replace each $P^\alpha$ by $\chi'(P)$ to
perform the averaging.

The number of energy-levels in an exclusive set $\chi = \chi'$ that arise
from a given state of the unperturbed system is equal to the number
of eigenvalues of the right-hand side of (23) that are consistent with
the equations $\chi = \chi'$. This number is the number $n(\chi')$ introduced
at the end of the preceding section, and is thus just the degree of
degeneracy of the states in this set.

We have assumed that the individual kets $|\alpha^1\rangle, |\alpha^2\rangle, \ldots$ which
determine the unperturbed state according to (16) are all orthogonal. The
theory can easily be extended to the case when some of these kets are
equal, any two that are not equal being still restricted to be orthogonal.
We now have some permutations $P^\alpha$ such that $P^\alpha|X\rangle = |X\rangle$,
namely those permutations which involve only interchanges of
equal $\alpha$'s. Equation (20) will now hold if the summation is extended
only over those $P$'s which make $P^\alpha|X\rangle$ different. With this change
in the meaning of $\sum_P$, all the previous equations still hold, including
the result (23). For the present $|X\rangle$ there will be restrictions on the
possible numerical values of the $\chi$'s, e.g. they cannot have those
values corresponding to $|X\rangle$ being antisymmetrical.


58. Application to electrons 

Let us consider the case when the similar particles are electrons. 
This requires, according to Pauli’s exclusion principle discussed in 
§ 54, that we take into account only the antisymmetrical states. It 
is now necessary to make explicit reference to the fact that electrons 
have spins, which show themselves through an angular momentum
and a magnetic moment. The effect of the spin on the motion of
an electron in an electromagnetic field is not very great. There
are additional forces on the electron due to its magnetic moment,
requiring additional terms in the Hamiltonian. The spin angular
momentum does not have any direct action on the motion, but it comes 
into play when there are forces tending to rotate the magnetic moment, 
since the magnetic moment and angular momentum are constrained 
to be always in the same direction. In the absence of a strong 
magnetic field these effects are all small, of the same order of magni- 
tude as the corrections required by relativistic mechanics, and there 
would be no point in taking them into account in a non-relativistic 
theory. The importance of the spin lies not in these small effects on the 
motion of the electron, but in the fact that it gives two internal states 
to the electron, corresponding to the two possible values of the spin 
component in any assigned direction, which causes a doubling in the 
number of independent states of an electron. This fact has far-reaching 
consequences when combined with Pauli's exclusion principle.

In dealing with an assembly of electrons we have two kinds of
dynamical variables. The first kind, which we may call the orbital
variables, consists of the coordinates $x, y, z$ of all the electrons and
their conjugate momenta $p_x, p_y, p_z$. The second kind consists of the
spin variables, the variables $\sigma_x, \sigma_y, \sigma_z$, as introduced in § 37, for all
the electrons. These two kinds of variables belong to different degrees
of freedom. According to §§ 20 and 21, a ket fixing the state of the
whole system may be of the form $|A\rangle|B\rangle$, where $|A\rangle$ is a ket referring
to the orbital variables alone and $|B\rangle$ is a ket referring to the spin
variables alone, and the general ket fixing a state of the whole system
is a sum or integral of kets of this form. This way of looking at things
enables us to introduce two kinds of permutation operators, the first
kind, $P^x$ say, applying to the orbital variables only and operating
only on the factor $|A\rangle$, and the second kind, $P^\sigma$ say, applying only
to the spin variables and operating only on the factor $|B\rangle$. The $P^x$'s
and $P^\sigma$'s can each be applied to any ket for the whole system, not
merely to certain special kets, like the $P^\alpha$'s of the preceding section.
The permutations $P$ that we have had up to the present apply to all
the dynamical variables of the particles concerned, so for electrons
they will apply to both the orbital and the spin variables. This means
that each $P_a$ equals the product

$$P_a = P^x_a P^\sigma_a. \tag{24}$$

We can now see the need for taking the spin variables into account
when applying Pauli's exclusion principle, even if we neglect the spin
forces in the Hamiltonian. For any state occurring in nature each
$P_a$ must have the value $\pm 1$, according to whether it is an even or
an odd permutation, so from (24)

$$P^x_a P^\sigma_a = \pm 1. \tag{25}$$

The theory of the three preceding sections would become trivial if
applied directly to electrons, for which each $P_a = \pm 1$. We may,
however, apply it to the $P^x$ permutations of electrons. The $P^\sigma$'s are
constants of the motion if we neglect the terms in the Hamiltonian
that arise from the spin forces, since this neglect results in the
Hamiltonian not involving the spin dynamical variables $\sigma$ at all. The
$P^x$'s must then also be constants of the motion. We can now
introduce new $\chi$'s, equal to the average of all of the $P^x$'s in each class, and
assert that for any permissible set of numerical values $\chi'$ for these $\chi$'s
there will be one exclusive set of states. Thus there exist exclusive sets
of states for systems containing many electrons even when we restrict
ourselves to a consideration of only those states that satisfy Pauli's
principle. The exclusiveness of the sets of states is now, of course,
only approximate, since the $\chi$'s are constants only so long as we
neglect the spin forces. There will actually be a small probability for
a transition from a state in one set to a state in another.

Equation (25) gives us a simple connexion between the $P^x$'s and
$P^\sigma$'s, which means that instead of studying the dynamical variables
$P^x$ we can get all the results we want, e.g. the characters $\chi'$, by
studying the dynamical variables $P^\sigma$. The $P^\sigma$'s are much easier to
study on account of there being only two independent states of spin
for each electron. This fact results in there being fewer characters $\chi'$
for the group of permutations of the $\sigma$-variables than for the group
of general permutations, since it prevents a ket in the spin variables
from being antisymmetrical in more than two of them.

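The study of the $P^\sigma$'s is made specially easy by the fact that we
can express them as algebraic functions of the dynamical variables $\sigma$.
Consider the quantity

$$O_{12} = \tfrac{1}{2}\{1 + \sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2}\} = \tfrac{1}{2}\{1 + (\sigma_1, \sigma_2)\}.$$

With the help of equations (50) and (51) of § 37 we find readily that

$$(\sigma_1, \sigma_2)^2 = (\sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2})^2 = 3 - 2(\sigma_1, \sigma_2), \tag{26}$$

and hence that

$$O_{12}^2 = \tfrac{1}{4}\{1 + 2(\sigma_1, \sigma_2) + (\sigma_1, \sigma_2)^2\} = 1. \tag{27}$$

Again, we find

$$O_{12}\,\sigma_{x1} = \tfrac{1}{2}\{\sigma_{x1} + \sigma_{x2} - i\sigma_{z1}\sigma_{y2} + i\sigma_{y1}\sigma_{z2}\},$$
$$\sigma_{x2}\,O_{12} = \tfrac{1}{2}\{\sigma_{x2} + \sigma_{x1} + i\sigma_{y1}\sigma_{z2} - i\sigma_{z1}\sigma_{y2}\},$$

and hence

$$O_{12}\,\sigma_{x1} = \sigma_{x2}\,O_{12}.$$

Similar relations hold for $\sigma_{y1}$ and $\sigma_{z1}$, so that we have

$$O_{12}\,\sigma_1 = \sigma_2\,O_{12}$$

or

$$O_{12}\,\sigma_1\,O_{12}^{-1} = \sigma_2.$$

From this we can obtain with the help of (27)

$$O_{12}\,\sigma_2\,O_{12}^{-1} = \sigma_1.$$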

These commutation relations for $O_{12}$ with $\sigma_1$ and $\sigma_2$ are precisely the
same as those for $P^\sigma_{12}$, the permutation consisting of the interchange
of the spin variables of electrons 1 and 2. Thus we can put

$$O_{12} = cP^\sigma_{12},$$

where $c$ is a number. Equation (27) shows that $c = \pm 1$. To determine
which of these values for $c$ is the correct one, we observe that
the eigenvalues of $P^\sigma_{12}$ are $1, 1, 1, -1$, corresponding to the fact that
there exist three independent symmetrical and one antisymmetrical
state in the spin variables of two electrons, namely, with the notation
of § 37, the states represented by the three symmetrical functions
$f_\alpha(\sigma'_{z1})f_\alpha(\sigma'_{z2})$, $f_\beta(\sigma'_{z1})f_\beta(\sigma'_{z2})$,
$f_\alpha(\sigma'_{z1})f_\beta(\sigma'_{z2}) + f_\beta(\sigma'_{z1})f_\alpha(\sigma'_{z2})$, and the one
antisymmetrical function $f_\alpha(\sigma'_{z1})f_\beta(\sigma'_{z2}) - f_\beta(\sigma'_{z1})f_\alpha(\sigma'_{z2})$. Thus the mean
of the eigenvalues of $P^\sigma_{12}$ is $\tfrac{1}{2}$. Now the mean of the eigenvalues of
$(\sigma_1, \sigma_2)$ is evidently zero and hence the mean of the eigenvalues of $O_{12}$
is $\tfrac{1}{2}$. Thus we must have $c = +1$, and so we can put

$$P^\sigma_{12} = \tfrac{1}{2}\{1 + (\sigma_1, \sigma_2)\}. \tag{28}$$
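
The relations (26), (27) and (28) can be checked numerically for two spins.
The following is a minimal sketch, assuming the usual 2 x 2 Pauli matrices
for the $\sigma$'s and a basis for the two-spin space ordered as index
2a + b. The helper names (`kron`, `matmul`, `combine`, `close`) are
illustrative choices, not anything taken from the text.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ca, a, cb, b):
    """Return ca*a + cb*b, entry by entry."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    """True when the two matrices agree entry by entry within tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]
one4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# (sigma_1, sigma_2) = sx1 sx2 + sy1 sy2 + sz1 sz2 on the two-spin space.
dot12 = combine(1, combine(1, kron(sx, sx), 1, kron(sy, sy)), 1, kron(sz, sz))
O12 = combine(0.5, one4, 0.5, dot12)

# Interchange of the two spins: |a>|b> -> |b>|a>, with basis index 2*a + b.
swap = [[0] * 4 for _ in range(4)]
for a in range(2):
    for b in range(2):
        swap[2 * b + a][2 * a + b] = 1

eq26_holds = close(matmul(dot12, dot12), combine(3, one4, -2, dot12))
eq27_holds = close(matmul(O12, O12), one4)
eq28_holds = close(O12, swap)
mean_eigenvalue = sum(O12[i][i] for i in range(4)) / 4   # trace / dimension
```

The last quantity is the mean of the eigenvalues of $O_{12}$, obtained from
the trace, which is the number used above to fix the sign of $c$.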

In this way any permutation $P^\sigma$ consisting simply of an interchange
can be expressed as an algebraic function of the $\sigma$'s. Any other
permutation $P^\sigma$ can be expressed as a product of interchanges and can
therefore also be expressed as a function of the $\sigma$'s. With the help of
(25) we can now express the $P^x$'s as algebraic functions of the $\sigma$'s and
eliminate the $P^\sigma$'s from the discussion. We have, since the $-$ sign
must be taken in (25) when the permutations are interchanges and
since the square of an interchange is unity,

$$P^x_{12} = -\tfrac{1}{2}\{1 + (\sigma_1, \sigma_2)\}. \tag{29}$$

The formula (29) may conveniently be used for the evaluation of
the characters $\chi'$ which define the exclusive sets of states. We have,
for example, for the permutations consisting of interchanges,

$$\chi_{12} = \chi(P^x_{12}) = -\frac{1}{2}\Bigl\{1 + \frac{2}{n(n-1)} \sum_{r<t} (\sigma_r, \sigma_t)\Bigr\}.$$

If we introduce the dynamical variable $s$ to describe the magnitude of
the total spin angular momentum, $\tfrac{1}{2}\sum_r \sigma_r$ in units of $\hbar$, through the
formula

$$s(s+1) = \Bigl(\tfrac{1}{2}\sum_r \sigma_r,\ \tfrac{1}{2}\sum_t \sigma_t\Bigr),$$

in agreement with (39) of § 36, we have

$$2\sum_{r<t} (\sigma_r, \sigma_t) = \sum_{r,t} (\sigma_r, \sigma_t) - \sum_r (\sigma_r, \sigma_r) = 4s(s+1) - 3n.$$

Hence

$$\chi_{12} = -\frac{1}{2}\Bigl\{1 + \frac{4s(s+1) - 3n}{n(n-1)}\Bigr\} = -\frac{n(n-4) + 4s(s+1)}{2n(n-1)}. \tag{30}$$

Thus $\chi_{12}$ is expressible as a function of the dynamical variable $s$ and
of $n$ the number of electrons. Any of the other $\chi$'s could be evaluated
on similar lines and would have to be a function of $s$ and $n$ only, since
there are no other symmetrical functions of all the $\sigma$ dynamical
variables which could be involved. There is therefore one set of
numerical values $\chi'$ for the $\chi$'s, and thus one exclusive set of states,
for each eigenvalue $s'$ of $s$. The eigenvalues of $s$ are

$$\tfrac{1}{2}n,\quad \tfrac{1}{2}n - 1,\quad \tfrac{1}{2}n - 2,\quad \ldots,$$

the series terminating with 0 or $\tfrac{1}{2}$.
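
As a small worked illustration of (30), the character of an interchange
can be tabulated for each allowed total spin. This is a sketch only; the
function names `chi12` and `spin_values` are hypothetical, and exact
rational arithmetic is used so that half-integral values of $s'$ cause no
rounding.

```python
from fractions import Fraction

def chi12(n, s):
    """Character of an interchange from (30), for n electrons and total spin s."""
    s = Fraction(s)
    return -(n * (n - 4) + 4 * s * (s + 1)) / Fraction(2 * n * (n - 1))

def spin_values(n):
    """Eigenvalues n/2, n/2 - 1, ... of s, terminating with 0 or 1/2."""
    s = Fraction(n, 2)
    values = []
    while s >= 0:
        values.append(s)
        s -= 1
    return values

# One exclusive set of states for each allowed total spin of three electrons.
table = {s: chi12(3, s) for s in spin_values(3)}
```

For two electrons the function gives $-1$ for $s' = 1$ and $+1$ for
$s' = 0$, in agreement with (29) and (28): the spin-symmetrical states are
antisymmetrical in the orbital variables, and conversely.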

We see in this way that each of the stationary states of a system
with several electrons is an eigenstate of $s$, the magnitude in units of
$\hbar$ of the total spin angular momentum $\tfrac{1}{2}\sum_r \sigma_r$, belonging to a definite
eigenvalue $s'$. For any given $s'$ there will be $2s' + 1$ possible values
for a component of the total spin vector in any direction and these
will correspond to $2s' + 1$ independent stationary states with the same
energy. When we do not neglect the forces due to the spin magnetic
moments these $2s' + 1$ states will in general be split up into $2s' + 1$
states with slightly different energies, and will thus form a multiplet
of multiplicity $2s' + 1$. Transitions in which $s'$ changes, i.e. transitions
from one multiplicity to another, cannot occur when the spin forces
are neglected and will have only a small probability of occurrence
when the spin forces are not neglected.


We can determine the energy-levels of a system with several
electrons to the first approximation by applying the theory of the
preceding section with the kets $|\alpha^r\rangle$ referring only to the orbital
variables and using formula (23). If we consider only the Coulomb
forces between the electrons, then the interaction energy $V$ will
consist of a sum of parts each referring to only two electrons, which
will result in all the matrix elements $V_P$ vanishing except those for
which $P^\alpha$ is the identical permutation or is simply an interchange of
two electrons. Thus (23) will reduce to

$$V \approx V_1 + \sum_{r<s} V_{rs} P^\alpha_{rs}, \tag{31}$$

$V_{rs}$ being the matrix element referring to the interchange of electrons
$r$ and $s$. Since the $P^\alpha$'s have the same properties as the $P^x$'s, any
function of the $P^\alpha$'s will have the same eigenvalues as the
corresponding function of the $P^x$'s, so that the right-hand side of (31)
will have the same eigenvalues as

$$V_1 + \sum_{r<s} V_{rs} P^x_{rs}$$

or

$$V_1 - \tfrac{1}{2}\sum_{r<s} V_{rs}\{1 + (\sigma_r, \sigma_s)\} \tag{32}$$

from (29). The eigenvalues of (32) will give the first-order corrections
in the energy-levels. The form of (32) shows that a model which
assumes a coupling energy between the spins of the various electrons,
of magnitude $-\tfrac{1}{2}V_{rs}(\sigma_r, \sigma_s)$ for the electrons in the $r$ and $s$ orbital
states, would meet with a fair amount of success. This coupling
energy is much greater than that of the spin magnetic moments. Such
models of the atom were in use before the justification by quantum
mechanics was obtained.
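
As a worked illustration (a sketch for the simplest case only, two
electrons in different orbital states), the eigenvalues of (32) follow at
once from the relation $2\sum_{r<t}(\sigma_r, \sigma_t) = 4s(s+1) - 3n$ with
$n = 2$:

```latex
\begin{aligned}
(\sigma_1, \sigma_2) &= 2s(s+1) - 3, \\
V_1 - \tfrac{1}{2} V_{12}\{1 + (\sigma_1, \sigma_2)\} &= V_1 - V_{12}\{s(s+1) - 1\}, \\
s' = 1 &: \quad V_1 - V_{12} \quad (\text{three states}), \\
s' = 0 &: \quad V_1 + V_{12} \quad (\text{one state}).
\end{aligned}
```

The two multiplicities are thus separated by $2V_{12}$ to the first order.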

We may have two of the orbital states of the unperturbed system
the same, i.e. the kets $|\alpha^r\rangle$ in the orbital variables for two electrons
may be the same. Suppose $|\alpha^1\rangle$ and $|\alpha^2\rangle$ are the same. Then we must
take only those eigenvalues of (31) that are consistent with $P^\alpha_{12} = 1$
or those eigenvalues of (32) that are consistent with $P^x_{12} = 1$ or
$P^\sigma_{12} = -1$. From (28) this condition gives $(\sigma_1, \sigma_2) = -3$, so that
$(\sigma_1 + \sigma_2)^2 = 0$. Thus the resultant of the two spins $\sigma_1$ and $\sigma_2$ is zero,
which may be interpreted as the spins $\sigma_1$ and $\sigma_2$ being antiparallel.
Thus we may say that two electrons in the same orbital state have
their spins antiparallel. More than two electrons cannot be in the
same orbital state.

THEORY OF RADIATION
59. An assembly of bosons

We consider a dynamical system composed of $u'$ particles.
We set up a representation for one of the particles with discrete basic
kets $|\alpha^{(1)}\rangle, |\alpha^{(2)}\rangle, |\alpha^{(3)}\rangle, \ldots$. Then, as explained in § 54, we get a
symmetrical representation of the assembly of $u'$ particles by taking as
basic kets the products

$$|\alpha_1^a\rangle|\alpha_2^b\rangle|\alpha_3^c\rangle\cdots|\alpha_{u'}^g\rangle = |\alpha_1^a \alpha_2^b \alpha_3^c \ldots \alpha_{u'}^g\rangle \tag{1}$$

in which there is one factor for each particle, the suffixes $1, 2, 3, \ldots, u'$
of the $\alpha$'s being the labels of the particles and the indices $a, b, c, \ldots, g$
denoting indices $(1), (2), (3), \ldots$ in the basic kets for one particle. If the
particles are bosons, so that only symmetrical states occur in nature,
then we need to work with only the symmetrical kets that can be
constructed from the kets (1). The states corresponding to these
symmetrical kets will form a complete set of states for the assembly
of bosons. We can build up a theory of them as follows.

We introduce the linear operator $S$ defined by

$$S = u'!^{-1/2} \sum P, \tag{2}$$

the sum being taken over all the $u'!$ permutations of the $u'$ particles.
Then $S$ applied to any ket for the assembly gives a symmetrical ket.
We may therefore call $S$ the symmetrizing operator. From (8) of § 55
it is real. Applied to the ket (1) it gives

$$u'!^{-1/2} \sum P\,|\alpha_1^a \alpha_2^b \alpha_3^c \ldots \alpha_{u'}^g\rangle = S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle, \tag{3}$$

the labels of the particles being omitted on the right-hand side as
they are no longer relevant. The ket (3) corresponds to a state for
the assembly of $u'$ bosons with a definite distribution of the bosons
among the various boson states, without any particular boson being
assigned to any particular state. The distribution of bosons is
specified if we specify how many bosons are in each boson state. Let
$n'_1, n'_2, n'_3, \ldots$ be the numbers of bosons in the states $\alpha^{(1)}, \alpha^{(2)}, \alpha^{(3)}, \ldots$
respectively with this distribution. The $n'$'s are defined algebraically
by the equation

$$\alpha^a + \alpha^b + \alpha^c + \ldots + \alpha^g = n'_1\alpha^{(1)} + n'_2\alpha^{(2)} + n'_3\alpha^{(3)} + \ldots. \tag{4}$$

The sum of the $n'$'s is of course $u'$. The number of $n'$'s is equal to
the number of basic kets $|\alpha^{(a)}\rangle$, which in most applications of the
theory is very much greater than $u'$, so most of the $n'$'s will be zero.
If $\alpha^a, \alpha^b, \alpha^c, \ldots, \alpha^g$ are all different, i.e. if the $n'$'s are all 0 or 1, the
ket (3) is normalized, since in this case the terms on the left-hand
side of (3) are all orthogonal to one another and each contributes
$u'!^{-1}$ to the squared length of the ket. However, if $\alpha^a, \alpha^b, \alpha^c, \ldots, \alpha^g$
are not all different, those terms on the left-hand side of (3) will
be equal which arise from permutations $P$ which merely interchange
bosons in the same state. The number of equal terms will be
$n'_1!\, n'_2!\, n'_3! \ldots$, so the squared length of the ket (3) will be

$$\langle\alpha^a \alpha^b \alpha^c \ldots \alpha^g|S^2|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = n'_1!\, n'_2!\, n'_3! \ldots. \tag{5}$$
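
The counting argument leading to (5) can be sketched by brute force for a
few particles. The code below assumes orthonormal basic kets, so that two
product kets have scalar product 1 when identical and 0 otherwise; the
function names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial

def squared_length(states):
    """Squared length of S|states>, with S = u'!^(-1/2) * (sum of all P).

    `states` lists the one-particle state occupied by each particle.
    """
    u = len(states)
    # Number of permutations leading to each distinct product ket.
    multiplicity = Counter(permutations(states))
    # Distinct product kets are orthonormal, so the squared length is the
    # sum of squared multiplicities, times the factor 1/u'! from S twice.
    return Fraction(sum(m * m for m in multiplicity.values()), factorial(u))

def product_of_factorials(states):
    """n'_1! n'_2! n'_3! ... for the distribution defined by `states`."""
    result = 1
    for count in Counter(states).values():
        result *= factorial(count)
    return result

example = ("a", "a", "b")   # two bosons in one state, one in another
length_example = squared_length(example)
```

With all states different the function returns 1, which is the statement
that the ket (3) is then normalized.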

For dealing with a general state of the assembly we can introduce
the numbers $n_1, n_2, n_3, \ldots$ of bosons in the states $\alpha^{(1)}, \alpha^{(2)}, \alpha^{(3)}, \ldots$
respectively and treat the $n$'s as dynamical variables or as
observables. They have the eigenvalues $0, 1, 2, \ldots, u'$. The ket (3) is a
simultaneous eigenket of all the $n$'s, belonging to the eigenvalues
$n'_1, n'_2, n'_3, \ldots$. The various kets (3) form a complete set for the
dynamical system consisting of $u'$ bosons, so the $n$'s all commute
(see the converse to the theorem of § 13). Further, there is only one
independent ket (3) belonging to any set of eigenvalues $n'_1, n'_2, n'_3, \ldots$.
Hence the $n$'s form a complete set of commuting observables. If we
normalize the kets (3) and then label the resulting kets by the
eigenvalues of the $n$'s to which they belong, i.e. if we put

$$(n'_1!\, n'_2!\, n'_3! \ldots)^{-1/2}\, S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = |n'_1 n'_2 n'_3 \ldots\rangle, \tag{6}$$

we get a set of kets $|n'_1 n'_2 n'_3 \ldots\rangle$, with the $n'$'s taking on all non-negative
integral values adding up to $u'$, which kets will form the basic kets
of a representation with the $n$'s diagonal.

The $n$'s can be expressed as functions of the observables $\alpha_1, \alpha_2,
\alpha_3, \ldots, \alpha_{u'}$ which define the basic kets of the individual bosons by
means of the equations

$$n_a = \sum_r \delta_{\alpha_r \alpha^a}, \tag{7}$$

or the equations

$$\sum_a n_a f(\alpha^a) = \sum_r f(\alpha_r) \tag{8}$$

holding for any function $f$.

Let us now suppose that the number of bosons in the assembly is
not given, but is variable. This number is then a dynamical variable
or observable $u$, with eigenvalues $0, 1, 2, \ldots$, and the ket (3) is an
eigenket of $u$ belonging to the eigenvalue $u'$. To get a complete
set of kets for our dynamical system we must now take all the
symmetrical kets (3) for all values of $u'$. We may arrange them in
order thus

$$|\rangle,\quad |\alpha^a\rangle,\quad S|\alpha^a \alpha^b\rangle,\quad S|\alpha^a \alpha^b \alpha^c\rangle,\quad \ldots, \tag{9}$$

where first is written the ket, with no label, corresponding to the
state with no bosons present, then come the kets corresponding to
states with one boson present, then those corresponding to states
with two bosons, and so on. A general state corresponds to a ket
which is a sum of the various kets (9). The kets (9) are all orthogonal
to one another, two kets referring to the same number of bosons being
orthogonal as before, and two referring to different numbers of bosons
being orthogonal since they are eigenkets of $u$ belonging to different
eigenvalues. By normalizing all the kets (9), we get a set of kets like
(6) with no restriction on the $n'$'s (i.e. each $n'$ taking on all
non-negative integral values) and these kets form the basic kets of a
representation with the $n$'s diagonal for the dynamical system
consisting of a variable number of bosons.

If there is no interaction between the bosons and if the basic kets
$|\alpha^{(1)}\rangle, |\alpha^{(2)}\rangle, \ldots$ correspond to stationary states of a boson, the kets (9)
will correspond to stationary states for the assembly of bosons. The
number $u$ of bosons is now constant in time, but it need not be a
specified number, i.e. the general state is a superposition of states
with various values for $u$. If the energy of one boson is $H(\alpha)$, the
energy of the assembly will be

$$\sum_r H(\alpha_r) = \sum_a n_a H_a \tag{10}$$

from (8), $H_a$ being short for the number $H(\alpha^a)$. This gives the
Hamiltonian for the assembly as a function of the dynamical
variables $n$.

60. The connexion between bosons and oscillators 

In § 34 we studied the harmonic oscillator, a dynamical system of
one degree of freedom describable in terms of a canonical $q$ and $p$,
such that the Hamiltonian is a sum of squares of $q$ and $p$, with
numerical coefficients. We define a general oscillator mathematically
as a system of one degree of freedom describable in terms of a
canonical $q$ and $p$, such that the Hamiltonian is a power series in $q$
and $p$, and remains so if the system is perturbed in any way. We
shall now study a dynamical system composed of several of these
oscillators. We can describe each oscillator in terms of, instead of
$q$ and $p$, a complex dynamical variable $\eta$, like the $\eta$ of § 34, and its
conjugate complex $\bar\eta$, satisfying the commutation relation (7) of
§ 34. We attach labels $1, 2, 3, \ldots$ to the different oscillators, so that
the whole set of oscillators is describable in terms of the dynamical
variables $\eta_1, \eta_2, \eta_3, \ldots, \bar\eta_1, \bar\eta_2, \bar\eta_3, \ldots$ satisfying the commutation
relations

$$\left.\begin{aligned}
\eta_a \eta_b - \eta_b \eta_a &= 0, \\
\bar\eta_a \bar\eta_b - \bar\eta_b \bar\eta_a &= 0, \\
\bar\eta_a \eta_b - \eta_b \bar\eta_a &= \delta_{ab}.
\end{aligned}\right\} \tag{11}$$

Put

$$\eta_a \bar\eta_a = n_a, \tag{12}$$

so that

$$\bar\eta_a \eta_a = n_a + 1. \tag{13}$$

The $n$'s are observables which commute with one another and the
work of § 34 shows that each of them has as eigenvalues all
non-negative integers. For the $a$th oscillator there is a standard ket, $|0_a\rangle$
say, which is a normalized eigenket of $n_a$ belonging to the eigenvalue
zero. By multiplying all these standard kets together we get a
standard ket for the set of oscillators,

$$|0_1\rangle|0_2\rangle|0_3\rangle\cdots = |0_1 0_2 0_3 \ldots\rangle, \tag{14}$$

which is a simultaneous eigenket of all the $n$'s belonging to the
eigenvalues zero. The standard ket (14) will be much used in the
future and will be denoted simply by $\rangle_S$. From (13) of § 34

$$\bar\eta_a \rangle_S = 0 \tag{15}$$

for any $a$. The work of § 34 also shows that, if $n'_1, n'_2, n'_3, \ldots$ are any
non-negative integers,

$$\eta_1^{n'_1} \eta_2^{n'_2} \eta_3^{n'_3} \cdots \rangle_S \tag{16}$$

is a simultaneous eigenket of all the $n$'s belonging to the eigenvalues
$n'_1, n'_2, n'_3, \ldots$ respectively. The various kets (16) obtained by taking
different $n'$'s form a complete set of kets all orthogonal to one another
and the square of the length of one of them is, from (16) of § 34,
$n'_1!\, n'_2!\, n'_3! \ldots$. From this we see, bearing in mind the result (5), that
the kets (16) have just the same properties as the kets (9), so that
we can equate each ket (16) to the ket (9) referring to the same $n'$
values without getting any inconsistency. This involves putting

$$S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = \eta_a \eta_b \eta_c \cdots \eta_g \rangle_S. \tag{17}$$

The standard ket $\rangle_S$ becomes equal to the first of the kets (9),
corresponding to no bosons present.
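
The properties of the kets (16) can be illustrated with a small model. The
sketch below is an assumption-laden illustration, not the method of § 34:
a ket is stored as a dictionary from occupation numbers $(n'_1, n'_2, \ldots)$
of the normalized kets (6) to amplitudes, $\eta_a$ multiplies by
$\sqrt{n'_a + 1}$ and raises $n'_a$, and $\bar\eta_a$ multiplies by
$\sqrt{n'_a}$ and lowers it. Python index 0 stands for oscillator 1.

```python
from math import isclose, sqrt

def create(a, ket):
    """Apply eta_a to a ket stored as {occupation-number tuple: amplitude}."""
    out = {}
    for occ, amp in ket.items():
        new = occ[:a] + (occ[a] + 1,) + occ[a + 1:]
        out[new] = out.get(new, 0.0) + sqrt(occ[a] + 1) * amp
    return out

def destroy(a, ket):
    """Apply the conjugate complex eta-bar_a; it gives zero when n_a = 0."""
    out = {}
    for occ, amp in ket.items():
        if occ[a] > 0:
            new = occ[:a] + (occ[a] - 1,) + occ[a + 1:]
            out[new] = out.get(new, 0.0) + sqrt(occ[a]) * amp
    return out

def minus(k1, k2):
    """Difference k1 - k2 of two kets."""
    out = dict(k1)
    for occ, amp in k2.items():
        out[occ] = out.get(occ, 0.0) - amp
    return out

def same(k1, k2, tol=1e-9):
    """True when two kets have equal amplitudes within tol."""
    keys = set(k1) | set(k2)
    return all(isclose(k1.get(o, 0.0), k2.get(o, 0.0), abs_tol=tol) for o in keys)

def norm2(ket):
    """Squared length, the basic kets being orthonormal."""
    return sum(abs(amp) ** 2 for amp in ket.values())

standard = {(0, 0, 0): 1.0}                      # standard ket for three oscillators
ket = create(0, create(0, create(1, standard)))  # eta_1 eta_1 eta_2 on the standard ket

squared_length = norm2(ket)                      # expected 2! 1! 0! from (5)
comm_same = minus(destroy(0, create(0, ket)), create(0, destroy(0, ket)))
comm_diff = minus(destroy(0, create(1, ket)), create(1, destroy(0, ket)))
number_1 = create(0, destroy(0, ket))            # n_1 = eta_1 eta-bar_1 applied to ket
```

Here `comm_same` reproduces the ket itself and `comm_diff` vanishes, as the
third of the relations (11) requires, while `number_1` is twice the ket,
corresponding to $n'_1 = 2$.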

The effect of equation (17) is to identify the states of an assembly
of bosons with the states of a set of oscillators. This means that the
dynamical system consisting of an assembly of similar bosons is
equivalent to the dynamical system consisting of a set of oscillators; the two
systems are just the same system looked at from two different points of
view. There is one oscillator associated with each independent boson
state. We have here one of the most fundamental results of quantum
mechanics, which enables a unification of the wave and corpuscular
theories of light to be effected.

Our work in the preceding section was built up on a discrete set
of basic kets $|\alpha^a\rangle$ for a boson. We could pass to a different discrete
set of basic kets, $|\beta^A\rangle$ say, and build up a similar theory on them.
The basic kets for the assembly would then be, instead of (9),

$$|\rangle,\quad |\beta^A\rangle,\quad S|\beta^A \beta^B\rangle,\quad S|\beta^A \beta^B \beta^C\rangle,\quad \ldots. \tag{18}$$

The first of the kets (18), referring to no bosons present, is the same
as the first of the kets (9). Those kets (18) referring to one boson
present are linear functions of those kets (9) referring to one boson
present, namely

$$|\beta^A\rangle = \sum_a |\alpha^a\rangle\langle\alpha^a|\beta^A\rangle, \tag{19}$$

and generally those kets (18) referring to $u'$ bosons present are linear
functions of those kets (9) referring to $u'$ bosons present. Associated
with the new basic states $|\beta^A\rangle$ for a boson there will be a new set
of oscillator variables $\eta_A$, and corresponding to (17) we shall have

$$S|\beta^A \beta^B \beta^C \ldots\rangle = \eta_A \eta_B \eta_C \cdots \rangle_S. \tag{20}$$

Thus a ket $\eta_A \eta_B \cdots \rangle_S$ with $u'$ factors $\eta_A, \eta_B, \ldots$ must be a linear
function of kets $\eta_a \eta_b \cdots \rangle_S$ with $u'$ factors $\eta_a, \eta_b, \ldots$. It follows that each
linear operator $\eta_A$ must be a linear function of the $\eta_a$'s. Equation
(19) gives

$$\eta_A \rangle_S = \sum_a \eta_a \rangle_S \langle\alpha^a|\beta^A\rangle$$

and hence

$$\eta_A = \sum_a \eta_a \langle\alpha^a|\beta^A\rangle. \tag{21}$$

Thus the $\eta$'s transform according to the same law as the basic kets for
a boson. The transformed $\eta$'s satisfy, with their conjugate complexes,
the same commutation relations (11) as the original ones. The
transformed $\eta$'s are on just the same footing as the original ones and hence,
when we look upon our dynamical system as a set of oscillators, the
different degrees of freedom have no invariant significance.
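
The statement that the transformed $\eta$'s satisfy the relations (11) can be
checked in one line for the third relation. The following is a sketch,
assuming only (21), its conjugate complex
$\bar\eta_A = \sum_a \langle\beta^A|\alpha^a\rangle\,\bar\eta_a$, and the completeness
of the basic kets $|\alpha^a\rangle$:

```latex
\begin{aligned}
\bar\eta_A \eta_B - \eta_B \bar\eta_A
  &= \sum_{ab} \langle\beta^A|\alpha^a\rangle\,
     (\bar\eta_a \eta_b - \eta_b \bar\eta_a)\,
     \langle\alpha^b|\beta^B\rangle \\
  &= \sum_{ab} \langle\beta^A|\alpha^a\rangle\, \delta_{ab}\,
     \langle\alpha^b|\beta^B\rangle
   = \langle\beta^A|\beta^B\rangle = \delta_{AB}.
\end{aligned}
```

The first two relations of (11) carry over at once, since the transformed
$\eta$'s are linear in the original $\eta$'s alone.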

The $\bar\eta$'s transform according to the same law as the basic bras for
a boson, and thus the same law as the numbers $\langle\alpha^a|x\rangle$ forming the
representative of a state $x$. This similarity people often describe by
saying that the $\bar\eta_a$'s are given by a process of second quantization
applied to $\langle\alpha^a|x\rangle$, meaning thereby that, after one has set up a
quantum theory for a single particle and so introduced the numbers
$\langle\alpha^a|x\rangle$ representing a state of the particle, one can make these
numbers into linear operators satisfying with their conjugate complexes
the correct commutation relations, like (11), and one then has the
appropriate mathematical basis for dealing with an assembly of the
particles, provided they are bosons. There is a corresponding
procedure for fermions, which will be given in § 65.

Since an assembly of bosons is the same as a set of oscillators, it
must be possible to express any symmetrical function of the boson
variables in terms of the oscillator variables $\eta$ and $\bar\eta$. An example
of this is provided by equation (10) with $\eta_a \bar\eta_a$ substituted for $n_a$.
Let us see how it goes in general. Take first the case of a function
of the boson variables of the form

$$U_T = \sum_r U_r, \tag{22}$$

where each $U_r$ is a function only of the dynamical variables of the
$r$th boson, so that it has a representative $\langle\alpha_r^a|U_r|\alpha_r^b\rangle$ referring to the
basic kets $|\alpha_r^a\rangle$ of the $r$th boson. In order that $U_T$ may be symmetrical,
this representative must be the same for all $r$, so that it can depend
only on the two eigenvalues labelled by $a$ and $b$. We may therefore
write it

$$\langle\alpha_r^a|U_r|\alpha_r^b\rangle = \langle\alpha^a|U|\alpha^b\rangle = \langle a|U|b\rangle \tag{23}$$

for brevity. We have

$$U_r|\alpha_1^{x_1} \alpha_2^{x_2} \ldots\rangle = \sum_a |\alpha_1^{x_1} \alpha_2^{x_2} \ldots \alpha_r^a \ldots\rangle \langle a|U|x_r\rangle. \tag{24}$$

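Summing this equation for all values of $r$ and applying the
symmetrizing operator $S$ to both sides, we get

$$S U_T|\alpha_1^{x_1} \alpha_2^{x_2} \ldots\rangle = \sum_r \sum_a S|\alpha^{x_1} \alpha^{x_2} \ldots \alpha^a \ldots\rangle \langle a|U|x_r\rangle. \tag{25}$$

Since $U_T$ is symmetrical we can replace $S U_T$ by $U_T S$ and can then
substitute for the symmetrical kets in (25) their values given by (17).
We get in this way

$$\begin{aligned}
U_T\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S
  &= \sum_r \sum_a \eta_a\, \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \cdots \rangle_S\, \langle a|U|x_r\rangle \\
  &= \sum_{ab} \eta_a \sum_r \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \cdots \rangle_S\, \delta_{b x_r} \langle a|U|b\rangle,
\end{aligned} \tag{26}$$

$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be cancelled out. Now from
the commutation relations (11)

$$\bar\eta_b\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S = \sum_r \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \cdots \rangle_S\, \delta_{b x_r} \tag{27}$$

(note that $\bar\eta_b$ is like the operator of partial differentiation $\partial/\partial\eta_b$), so
(26) becomes

$$U_T\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S = \sum_{ab} \eta_a \bar\eta_b\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S\, \langle a|U|b\rangle. \tag{28}$$

The kets $\eta_{x_1} \eta_{x_2} \cdots \rangle_S$ form a complete set, and hence we can infer from
(28) the operator equation

$$U_T = \sum_{ab} \eta_a \langle a|U|b\rangle \bar\eta_b. \tag{29}$$

This gives us $U_T$ in terms of the $\eta$ and $\bar\eta$ variables and the matrix
elements $\langle a|U|b\rangle$.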
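
The passage from (25) to (28) can be checked on small examples. In the
sketch below, which is illustrative only, a symmetrical ket
$S|\alpha^{x_1}\alpha^{x_2}\ldots\rangle$ is built as $\eta_{x_1}\eta_{x_2}\cdots\rangle_S$ according
to (17), kets are stored as dictionaries from occupation numbers to
amplitudes, and the matrix `U` holds arbitrary real numbers standing for
$\langle a|U|b\rangle$. The function names are hypothetical.

```python
from math import isclose, sqrt

def create(a, ket):
    """Apply eta_a to a ket stored as {occupation-number tuple: amplitude}."""
    out = {}
    for occ, amp in ket.items():
        new = occ[:a] + (occ[a] + 1,) + occ[a + 1:]
        out[new] = out.get(new, 0.0) + sqrt(occ[a] + 1) * amp
    return out

def destroy(a, ket):
    """Apply eta-bar_a; the result is zero when the state a is unoccupied."""
    out = {}
    for occ, amp in ket.items():
        if occ[a] > 0:
            new = occ[:a] + (occ[a] - 1,) + occ[a + 1:]
            out[new] = out.get(new, 0.0) + sqrt(occ[a]) * amp
    return out

def add_into(total, ket, factor):
    """Accumulate factor * ket into total."""
    for occ, amp in ket.items():
        total[occ] = total.get(occ, 0.0) + factor * amp

def symmetrical_ket(xs, d):
    """eta_{x1} eta_{x2} ... on the standard ket, i.e. S|x1 x2 ...> by (17)."""
    ket = {(0,) * d: 1.0}
    for x in xs:
        ket = create(x, ket)
    return ket

def one_particle_sum(U, xs):
    """Right-hand side of (25): sum over r and a of S|..a..> <a|U|x_r>."""
    d, total = len(U), {}
    for r, xr in enumerate(xs):
        for a in range(d):
            ys = xs[:r] + (a,) + xs[r + 1:]
            add_into(total, symmetrical_ket(ys, d), U[a][xr])
    return total

def oscillator_form(U, xs):
    """Right-hand side of (28): sum over a, b of eta_a eta-bar_b ... <a|U|b>."""
    d, total = len(U), {}
    start = symmetrical_ket(xs, d)
    for a in range(d):
        for b in range(d):
            add_into(total, create(a, destroy(b, start)), U[a][b])
    return total

def same(k1, k2, tol=1e-9):
    keys = set(k1) | set(k2)
    return all(isclose(k1.get(o, 0.0), k2.get(o, 0.0), abs_tol=tol) for o in keys)

U = [[1.0, 0.5, 0.0], [0.5, -2.0, 0.3], [0.0, 0.3, 0.7]]   # arbitrary <a|U|b>
agree = same(one_particle_sum(U, (0, 0, 2)), oscillator_form(U, (0, 0, 2)))
```

The comparison holds whether or not some of the bosons are in the same
state, which is where the factors $\sqrt{n'_a}$ produced by $\bar\eta_b$ matter.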

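Now let us take a symmetrical function of the boson variables
consisting of a sum of terms each referring to two bosons,

$$V_T = \sum_{r,\, s \neq r} V_{rs}. \tag{30}$$

We do not need to assume $V_{rs} = V_{sr}$. Corresponding to (23), $V_{rs}$ has
matrix elements

$$\langle\alpha_r^a \alpha_s^b|V_{rs}|\alpha_r^c \alpha_s^d\rangle = \langle ab|V|cd\rangle \tag{31}$$

for brevity. Proceeding as before we get, corresponding to (25),

$$S V_T|\alpha_1^{x_1} \alpha_2^{x_2} \ldots\rangle = \sum_{r,\, s \neq r} \sum_{ab} S|\alpha^{x_1} \alpha^{x_2} \ldots \alpha^a \ldots \alpha^b \ldots\rangle \langle ab|V|x_r x_s\rangle \tag{32}$$

and corresponding to (26)

$$V_T\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S = \sum_{abcd} \eta_a \eta_b \sum_{r,\, s \neq r} \eta_{x_r}^{-1} \eta_{x_s}^{-1} \eta_{x_1} \eta_{x_2} \cdots \rangle_S\, \delta_{c x_r} \delta_{d x_s} \langle ab|V|cd\rangle. \tag{33}$$

We can deduce as an extension of (27)

$$\bar\eta_c \bar\eta_d\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S = \sum_{r,\, s \neq r} \eta_{x_r}^{-1} \eta_{x_s}^{-1} \eta_{x_1} \eta_{x_2} \cdots \rangle_S\, \delta_{c x_r} \delta_{d x_s}, \tag{34}$$

so that (33) becomes

$$V_T\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S = \sum_{abcd} \eta_a \eta_b \bar\eta_c \bar\eta_d\, \eta_{x_1} \eta_{x_2} \cdots \rangle_S\, \langle ab|V|cd\rangle,$$

giving us the operator equation

$$V_T = \sum_{abcd} \eta_a \eta_b \langle ab|V|cd\rangle \bar\eta_c \bar\eta_d. \tag{35}$$

The method can readily be extended to give any symmetrical
function of the boson variables in terms of the $\eta$'s and $\bar\eta$'s.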

The foregoing theory can easily be generalized to apply to an
assembly of bosons in interaction with some other dynamical system,
which we shall call for definiteness the atom. We must introduce a
set of basic kets, $|\zeta'\rangle$ say, for the atom alone. We can then get a set
of basic kets for the whole system of atom and bosons together by
multiplying each of the kets $|\zeta'\rangle$ into each of the kets (9). We may
write these kets

$$|\zeta'\rangle,\quad |\zeta' \alpha^a\rangle,\quad S|\zeta' \alpha^a \alpha^b\rangle,\quad S|\zeta' \alpha^a \alpha^b \alpha^c\rangle,\quad \ldots. \tag{36}$$

We may look upon the system as composed of the atom in interaction
with a set of oscillators, so that it can be described in terms of the
atom variables and the oscillator variables $\eta_a, \bar\eta_a$. Using again the
standard ket $\rangle_S$ for the set of oscillators, we have

$$S|\zeta' \alpha^a \alpha^b \alpha^c \ldots\rangle = \eta_a \eta_b \eta_c \cdots \rangle_S\, |\zeta'\rangle, \tag{37}$$

corresponding to (17), as the equation expressing the basic kets
(36) in terms of the oscillator variables.

Any function of the atom variables and boson variables which is
symmetrical between all the bosons is expressible as a function of the
atom variables and the $\eta$'s and $\bar\eta$'s. Consider first a function $U_T$ of
the form (22) with $U_r$ a function only of the atom variables and the
variables of the $r$th boson, so that it has a representative $\langle\zeta' \alpha_r^a|U_r|\zeta'' \alpha_r^b\rangle$.
This representative must be independent of $r$ in order that $U_T$ may
be symmetrical between all the bosons, so we may write it
$\langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle$. Now let us define $\langle a|U|b\rangle$ to be that function of the
atom variables whose representative is $\langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle$, so that we have

$$\langle\zeta' \alpha_r^a|U_r|\zeta'' \alpha_r^b\rangle = \langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle = \langle\zeta'|\langle a|U|b\rangle|\zeta''\rangle, \tag{38}$$

corresponding to (23). The equations (24)-(28) can now be taken over
and applied to the present work if both sides of all these equations
are multiplied by $|\zeta'\rangle$ on the right, with the result that formula (29)
still holds. We can deal similarly with a symmetrical function $V_T$ of
the form (30) with $V_{rs}$ a function only of the atom variables and the
variables of the $r$th and $s$th bosons. Defining $\langle ab|V|cd\rangle$ to be that
function of the atom variables whose representative is

$$\langle\zeta' \alpha_r^a \alpha_s^b|V_{rs}|\zeta'' \alpha_r^c \alpha_s^d\rangle,$$

we find that formula (35) still holds.

61. Emission and absorption of bosons 

Let us suppose that the oscillators of the preceding section are
harmonic oscillators and there is no interaction between them. The
energy of the $a$th oscillator is then, from (5) of § 34,

$$H_a = \hbar\omega_a\eta_a\bar\eta_a + \tfrac{1}{2}\hbar\omega_a.$$

We shall neglect the constant term $\tfrac{1}{2}\hbar\omega_a$, which is the energy of the
oscillator in its lowest state, the so-called 'zero-point energy'. This
neglect does not have any dynamical consequences, as explained at
the beginning of § 30, and merely involves a redefinition of $H_a$. The
total energy of all the oscillators is now

$$H_T = \sum_a H_a = \sum_a \hbar\omega_a\eta_a\bar\eta_a = \sum_a \hbar\omega_a n_a \tag{39}$$

with the help of (12). This is of the same form as (10), with $\hbar\omega_a$ for
$H^a$. Thus a set of harmonic oscillators is equivalent to an assembly of
bosons in stationary states with no interaction between them. If an
oscillator of the set is in its $n'$th quantum state, there are $n'$ bosons in
the associated boson state.

In general the Hamiltonian for the set of oscillators will be a power
series in the variables $\eta_a$, $\bar\eta_a$, say

$$H_T = H_P + \sum_a (U_a\eta_a + \bar U_a\bar\eta_a) + \sum_{ab} (U_{ab}\eta_a\bar\eta_b + V_{ab}\eta_a\eta_b + \bar V_{ab}\bar\eta_a\bar\eta_b) + \ldots, \tag{40}$$

where $H_P$, $U_a$, $U_{ab}$, $V_{ab}$ are numbers, $H_P$ being real and $U_{ab} = \bar U_{ba}$. If
the set of oscillators are in interaction with an atom, as we had at
the end of the preceding section, the total Hamiltonian will still be
of the form (40), with $H_P$, $U_a$, $U_{ab}$, $V_{ab}$ functions of the atom variables,
$H_P$ in particular being the Hamiltonian for the atom by itself. A
general treatment of this dynamical system would be rather
complicated and for practical applications one assumes that the terms

$$H_P + \sum_a U_{aa}\eta_a\bar\eta_a \tag{41}$$

are large compared with the others and form by themselves an
unperturbed system, the remaining terms being taken into account
as a perturbation producing transitions in the unperturbed system,
according to the theory of § 44. If, further, $U_{aa}$ is independent of the
atom variables, the unperturbed system with Hamiltonian (41)
consists merely of an atom with Hamiltonian $H_P$ and an assembly of
bosons in stationary states with Hamiltonian of the form (39), with
no interaction.

Let us consider what kinds of transitions are produced by the
various perturbation terms in (40). Take a stationary state of the
unperturbed system for which the atom is in a stationary state, $\zeta'$ say,
and bosons are present in the stationary boson states $a$, $b$, $c$, .... This
stationary state for the unperturbed system corresponds to the ket

$$\eta_a\eta_b\eta_c\ldots\rangle_S\,|\zeta'\rangle \tag{42}$$

like (37). If the term $U_x\eta_x$ of (40) is multiplied into this ket, the
result is a linear combination of kets like

$$\eta_x\eta_a\eta_b\eta_c\ldots\rangle_S\,|\zeta''\rangle, \tag{43}$$

$\zeta''$ denoting any stationary state of the atom. The ket (43) refers to
one more boson than the ket (42), the extra boson being in the state $x$.
Thus the perturbation term $U_x\eta_x$ gives rise to transitions in which
one boson is emitted into state $x$ and the atom makes an arbitrary
jump. If the term $\bar U_x\bar\eta_x$ of (40) is multiplied into (42), the result is
zero unless (42) contains a factor $\eta_x$ and is then a linear combination
of kets like

$$\eta_x^{-1}\eta_a\eta_b\eta_c\ldots\rangle_S\,|\zeta''\rangle,$$

referring to one boson less in state $x$. Thus the perturbation term
$\bar U_x\bar\eta_x$ gives rise to transitions in which one boson is absorbed from
state $x$, the atom again making an arbitrary jump. Similarly, we find
that a perturbation term $U_{xy}\eta_x\bar\eta_y$ ($x \neq y$) gives rise to processes in
which a boson is absorbed from state $y$ and one is emitted into state
$x$, or, what is the same thing physically, one boson makes a transition
from state $y$ to state $x$. This kind of process would be produced by
a term like the $U_T$ of (22) and (29) in the perturbation energy,
provided the diagonal elements $\langle a|U|a\rangle$ vanish. Again, the perturbation
terms $V_{xy}\eta_x\eta_y$, $\bar V_{xy}\bar\eta_x\bar\eta_y$ give rise to processes in which two bosons are
emitted or absorbed, and so on for more complicated terms. With
any of these emission and absorption processes the atom can make
an arbitrary jump.

Let us determine how the probability of occurrence of each of these
transition processes depends on the numbers of bosons originally
present in the various boson states. From §§ 44, 46 the transition
probability is always proportional to the square of the modulus of
the matrix element of the perturbation energy referring to the two
states concerned. Thus the probability of a boson being emitted into
state $x$ with the atom making a jump from state $\zeta'$ to state $\zeta''$ is
proportional to

$$|\langle\zeta''\,n_1'n_2'\ldots(n_x'+1)\ldots|U_x\eta_x|\zeta'\,n_1'n_2'\ldots n_x'\ldots\rangle|^2, \tag{44}$$

the $n'$'s being the numbers of bosons initially present in the various
boson states. Now from (6) and (17), with reference to (4),

$$|n_1'n_2'n_3'\ldots\rangle = (n_1'!\,n_2'!\,n_3'!\ldots)^{-1/2}\,\eta_1^{n_1'}\eta_2^{n_2'}\eta_3^{n_3'}\ldots\rangle_S, \tag{45}$$

so that

$$\eta_x|n_1'n_2'\ldots n_x'\ldots\rangle = (n_x'+1)^{1/2}\,|n_1'n_2'\ldots(n_x'+1)\ldots\rangle. \tag{46}$$

Hence (44) is equal to

$$(n_x'+1)\,|\langle\zeta''|U_x|\zeta'\rangle|^2, \tag{47}$$

showing that the probability of a transition in which a boson is emitted
into state $x$ is proportional to the number of bosons originally in state $x$
plus one.

The probability of a boson being absorbed from state $x$ with the
atom making a jump from state $\zeta'$ to state $\zeta''$ is proportional to

$$|\langle\zeta''\,n_1'n_2'\ldots(n_x'-1)\ldots|\bar U_x\bar\eta_x|\zeta'\,n_1'n_2'\ldots n_x'\ldots\rangle|^2, \tag{48}$$

the $n'$'s again being the numbers of bosons initially present in the
various boson states. Now from (45)

$$\bar\eta_x|n_1'n_2'\ldots n_x'\ldots\rangle = n_x'^{1/2}\,|n_1'n_2'\ldots(n_x'-1)\ldots\rangle, \tag{49}$$

so (48) is equal to

$$n_x'\,|\langle\zeta''|\bar U_x|\zeta'\rangle|^2. \tag{50}$$

Thus the probability of a transition in which a boson is absorbed from
state $x$ is proportional to the number of bosons originally in state $x$.
Similar methods may be applied to more complicated processes,
and show that the probability of a process in which a boson makes
a transition from state $y$ to state $x$ ($x \neq y$) is proportional to $n_y'(n_x'+1)$.
More generally, the probability of a process in which bosons are
absorbed from states $x$, $y$,... and emitted into states $a$, $b$,... is
proportional to

$$n_x'n_y'\ldots(n_a'+1)(n_b'+1)\ldots, \tag{51}$$

the $n'$'s being in each case the numbers of bosons originally present.
These results hold both for direct transition processes and transition
processes that take place through one or more intermediate states,
in accordance with the interpretation given at the end of § 44.
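
The occupation-number factors above can be checked numerically. The following is a minimal sketch, not part of the original text: it builds the matrix of $\eta$ for a single oscillator truncated at a hypothetical cutoff `N_MAX`, reads off the squared matrix elements for emission and absorption, and evaluates the factor (51) for distinct boson states. The helper names `raising_matrix`, `transpose`, and `process_weight` are illustrative assumptions.

```python
import math

def raising_matrix(n_max):
    """Matrix of eta in the basis |0>, |1>, ..., |n_max> (truncated at n_max)."""
    dim = n_max + 1
    eta = [[0.0] * dim for _ in range(dim)]
    for n in range(n_max):
        # eta |n> = sqrt(n + 1) |n + 1>, as in equation (46)
        eta[n + 1][n] = math.sqrt(n + 1)
    return eta

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

N_MAX = 6  # hypothetical cutoff, chosen only for illustration
eta = raising_matrix(N_MAX)
eta_bar = transpose(eta)  # eta is real here, so its conjugate is its transpose

# Squared matrix elements for emission (n -> n + 1) and absorption (n -> n - 1).
emission = [eta[n + 1][n] ** 2 for n in range(N_MAX)]
absorption = [eta_bar[n - 1][n] ** 2 for n in range(1, N_MAX + 1)]

def process_weight(absorbed, emitted, occupation):
    """Occupation-number factor of (51), assuming all states listed are distinct."""
    weight = 1
    for state in absorbed:
        weight *= occupation[state]
    for state in emitted:
        weight *= occupation[state] + 1
    return weight
```

For example, `process_weight(["y"], ["x"], {"x": 2, "y": 3})` evaluates $n_y'(n_x'+1)$ for a boson passing from state $y$ to state $x$.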


62. Application to photons 

Since photons are bosons, the foregoing theory can be applied to
them. A photon is in a stationary state when it is in an eigenstate
of momentum. It then has two independent states of polarization,
which may be taken to be two perpendicular states of linear
polarization. The dynamical variables needed to describe the stationary
states are then the momentum $\mathbf{p}$, a vector, and a polarization variable
$\mathbf{l}$, consisting of a unit vector perpendicular to $\mathbf{p}$. The variables $\mathbf{p}$ and
$\mathbf{l}$ take the place of our previous $\alpha$'s. The eigenvalues of $\mathbf{p}$ consist of
all numbers from $-\infty$ to $\infty$ for each of the three Cartesian
components of $\mathbf{p}$, while for each eigenvalue $\mathbf{p}'$ of $\mathbf{p}$, $\mathbf{l}$ has just two
eigenvalues, namely two arbitrarily chosen vectors perpendicular
to $\mathbf{p}'$ and to one another. Owing to the eigenvalues of $\mathbf{p}$ forming
a continuous range, there is a continuous range of stationary
states, giving us the continuous basic kets $|\mathbf{p}'\mathbf{l}'\rangle$. However, the
foregoing theory was built up in terms of discrete basic kets $|\alpha'\rangle$ for a
boson. There are two formalisms which one may use for getting over
this discrepancy.





The first consists in replacing the continuous three-dimensional
distribution of eigenvalues for $\mathbf{p}$ by a large number of discrete points
lying very close together, forming a dust spread over the whole
three-dimensional $\mathbf{p}$-space. Let $s_{\mathbf{p}'}$ be the density of the dust (the number
of points per unit volume) in the neighbourhood of any point $\mathbf{p}'$.
Then $s_{\mathbf{p}'}$ must be large and positive, but is otherwise an arbitrary
function of $\mathbf{p}'$. An integral over the $\mathbf{p}$-space may be replaced by a
sum over the dust of points, in accordance with the formula

$$\iiint f(\mathbf{p}')\,dp_x'\,dp_y'\,dp_z' = \sum_{\mathbf{p}'} f(\mathbf{p}')\,s_{\mathbf{p}'}^{-1}, \tag{52}$$

which formula provides the basis of the passage from continuous $\mathbf{p}'$
values to discrete ones and vice versa. Any problem can be worked
out in terms of the discrete $\mathbf{p}'$ values, for which the theory of §§ 59-61
can be used, and the results can be transformed back to refer to
continuous $\mathbf{p}'$ values. The arbitrary density $s_{\mathbf{p}'}$ should then disappear
from the results.
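
Formula (52) can be illustrated with a small numerical experiment. The sketch below is an assumption-laden toy, not taken from the text: it works in one dimension instead of three, places the dust points at $p = \sinh u$ on a uniform grid in $u$ (so that the density of points varies strongly with $p$), and checks that the weighted sum reproduces the integral of a Gaussian whatever that density is. The function name `dust_sum` and its parameters are hypothetical.

```python
import math

def dust_sum(f, u_max=3.0, du=1e-3):
    """One-dimensional analogue of (52): sum of f(p) / s_p over a dust of points.

    The points are p = sinh(u) for u on a uniform grid of spacing du, so the
    number of points per unit p is s_p = 1 / (du * cosh(u)), a non-uniform density.
    """
    steps = int(round(2 * u_max / du))
    total = 0.0
    for i in range(steps + 1):
        u = -u_max + i * du
        p = math.sinh(u)                 # position of this dust point
        inv_density = du * math.cosh(u)  # s_p^{-1}: local spacing between points
        total += f(p) * inv_density
    return total

# The integral of exp(-p^2) over the whole line is sqrt(pi).
result = dust_sum(lambda p: math.exp(-p * p))
```

The density chosen here is arbitrary; any other smooth, sufficiently fine dust should give the same limit, which is the sense in which $s_{\mathbf{p}'}$ disappears from the results.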


The second formalism consists in modifying the equations of the
theory of §§ 59-61 so as to make them apply to the case of a
continuous range of basic kets $|\alpha'\rangle$, by replacing sums by integrals and
replacing the $\delta$ symbol in the commutation relations (11) by $\delta$
functions, so far as concerns the variables with continuous eigenvalues.
Each of these formalisms has some advantages and some
disadvantages. The first is usually more convenient for physical discussion,
the second for mathematical development. Both will be developed
here and one or other will be used according to which is more suitable
at the moment.

The Hamiltonian describing an assembly of photons interacting
with an atom will be of the general form (40), with the coefficients
$H_P$, $U_a$, $U_{ab}$, $V_{ab}$ involving the atom variables. This Hamiltonian may
be written

$$H_T = H_P + H_Q + H_R, \tag{53}$$

where $H_P$ is the energy of the atom alone, $H_R$ is the energy of the
assembly of photons alone,

$$H_R = \sum_{\mathbf{p}'\mathbf{l}'} n_{\mathbf{p}'\mathbf{l}'}\,h\nu_{\mathbf{p}'}, \tag{54}$$

$\nu_{\mathbf{p}'}$ being the frequency of a photon of momentum $\mathbf{p}'$, and $H_Q$ is the
interaction energy, which can be evaluated from analogy with the
classical theory, as will be shown in the next section. The whole
system can be treated by a perturbation method as discussed in the
preceding section, $H_P$ and $H_R$ providing the energy (41) of the
unperturbed system and $H_Q$ being the perturbation energy, which
gives rise to transition processes in which photons are emitted and
absorbed and the atom jumps from one stationary state to another.

We saw in the preceding section that the probability of an absorp- 
tion process is proportional to the number of bosons originally in the 
state from which a boson is absorbed. From this we can infer that 
the probability of a photon being absorbed from a beam of radiation
incident on an atom is proportional to the intensity of the beam. 
We also saw that the probability of an emission process is propor- 
tional to the number of bosons originally in the state concerned plus 
one. To interpret this result we must make a careful study of the 
relations involved in replacing the continuous range of photon states 
by a discrete set. 

Let us neglect for the present the polarization variable $\mathbf{l}$. Let
$|\mathbf{p}'D\rangle$ be the normalized ket corresponding to the discrete photon
state $\mathbf{p}'$. Then from (22) of § 16

$$\sum_{\mathbf{p}'} |\mathbf{p}'D\rangle\langle\mathbf{p}'D| = 1,$$

which gives from (52)

$$\int |\mathbf{p}'D\rangle\langle\mathbf{p}'D|\,s_{\mathbf{p}'}\,d^3p' = 1, \tag{55}$$

$d^3p'$ being written for $dp_x'\,dp_y'\,dp_z'$, for brevity. Now if $|\mathbf{p}'\rangle$ is the basic
ket corresponding to the continuous state $\mathbf{p}'$, we have according to
(24) of § 16

$$\int |\mathbf{p}'\rangle\langle\mathbf{p}'|\,d^3p' = 1,$$

which shows, on comparison with (55), that

$$|\mathbf{p}'\rangle = |\mathbf{p}'D\rangle\,s_{\mathbf{p}'}^{1/2}. \tag{56}$$

The connexion between $|\mathbf{p}'\rangle$ and $|\mathbf{p}'D\rangle$ is like the connexion between
the basic kets when one changes the weight function of the
representation, as shown by (38) of § 16.

With $n'_{\mathbf{p}'}$ photons in each discrete photon state $\mathbf{p}'$, the Gibbs
density $\rho$ for the assembly of photons is, according to (68) of § 33,

$$\rho = \sum_{\mathbf{p}'} |\mathbf{p}'D\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'D| = \int |\mathbf{p}'D\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'D|\,s_{\mathbf{p}'}\,d^3p' = \int |\mathbf{p}'\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'|\,d^3p' \tag{57}$$

with the help of (56). The number of photons per unit volume in the
neighbourhood of any point $\mathbf{x}'$ is then $\langle\mathbf{x}'|\rho|\mathbf{x}'\rangle$, according to (73)
of § 33. From (57) this equals

$$\langle\mathbf{x}'|\rho|\mathbf{x}'\rangle = \int \langle\mathbf{x}'|\mathbf{p}'\rangle\,n'_{\mathbf{p}'}\,\langle\mathbf{p}'|\mathbf{x}'\rangle\,d^3p' = \int h^{-3}\,n'_{\mathbf{p}'}\,d^3p' \tag{58}$$

if one puts in the value of the transformation function $\langle\mathbf{x}'|\mathbf{p}'\rangle$ given
by (54) of § 23. Equation (58) expresses the number of photons per
unit volume as an integral over the momentum space, so the
integrand in (58) can be interpreted as the number of photons per unit
of phase space. We obtain in this way the result that the number of
photons per unit of phase space is equal to $h^{-3}$ times the number of
photons per discrete state, in other words, a cell of volume $h^3$ in phase
space is equivalent to a discrete state. This result is a general one,
holding for any kind of particle. If the polarization variable of the
photons is not neglected, the result holds for each of the two
independent states of polarization.

The momentum of a photon of frequency $\nu$ is of magnitude $h\nu/c$,
so the element of momentum space

$$dp_x\,dp_y\,dp_z = h^3c^{-3}\nu^2\,d\nu\,d\omega,$$

$d\omega$ being an element of solid angle for the direction of the vector $\mathbf{p}$.
Thus a distribution of photons with $n'_{\mathbf{p}'}$ per discrete state, which is
equivalent to a distribution of $h^{-3}n'_{\mathbf{p}'}\,d^3p\,d^3x$ photons in an element
of volume $d^3x$ and an element of momentum space $d^3p$, equals a
distribution of $n'_{\mathbf{p}'}c^{-3}\nu^2\,d\nu\,d\omega\,d^3x$ photons in an element of volume $d^3x$
and a frequency range $d\nu$ and direction of motion $d\omega$. This
corresponds to an energy density $n'_{\mathbf{p}'}hc^{-3}\nu^3$ per unit solid angle per unit
frequency range, or an intensity per unit frequency range (i.e. an
energy crossing unit area per unit time per unit frequency range) of
amount

$$I_\nu = n'_{\mathbf{p}'}\,h\nu^3/c^2. \tag{59}$$
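
The chain of conversions leading to (59) can be written out line by line. The block below is only a worked restatement of the argument in the text; it uses nothing beyond $p = h\nu/c$ and the phase-space result that $h^{-3}n'_{\mathbf{p}'}$ is the number of photons per unit of phase space.

```latex
\begin{align*}
d^3p &= p^2\,dp\,d\omega
      = \Bigl(\frac{h\nu}{c}\Bigr)^{2}\frac{h}{c}\,d\nu\,d\omega
      = h^{3}c^{-3}\nu^{2}\,d\nu\,d\omega, \\
h^{-3}n'_{\mathbf{p}'}\,d^3p\,d^3x
     &= n'_{\mathbf{p}'}\,c^{-3}\nu^{2}\,d\nu\,d\omega\,d^3x, \\
\text{energy density per unit solid angle and frequency}
     &= h\nu\cdot n'_{\mathbf{p}'}\,c^{-3}\nu^{2}
      = n'_{\mathbf{p}'}\,h\,c^{-3}\nu^{3}, \\
I_\nu &= c\cdot n'_{\mathbf{p}'}\,h\,c^{-3}\nu^{3}
       = n'_{\mathbf{p}'}\,h\nu^{3}/c^{2}.
\end{align*}
```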


The result that the probability of a photon being emitted is
proportional to $n'_{\mathbf{p}'\mathbf{l}'}+1$, $n'_{\mathbf{p}'\mathbf{l}'}$ being the number of photons initially present
in the discrete state concerned, can now be interpreted as the
probability being proportional to $I_{\nu\mathbf{l}} + h\nu^3/c^2$, where $I_{\nu\mathbf{l}}$ is the intensity of
the incident radiation per unit frequency range in the neighbourhood
of the frequency of the emitted photon and having the same
polarization $\mathbf{l}$ as the emitted photon. Thus with no incident radiation there
is still a certain amount of emission, but the emission is increased or
stimulated by incident radiation in the same direction and having the
same frequency and polarization as the emitted radiation. The
present theory of radiation thus completes the imperfect one of § 45
by giving both stimulated and spontaneous emission. The ratio it
gives for the two kinds of emission, namely $I_{\nu\mathbf{l}} : h\nu^3/c^2$, is in agreement
with that provided by Einstein’s theory of statistical equilibrium
mentioned in § 45.
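
As a rough numerical illustration of the ratio $I_{\nu\mathbf{l}} : h\nu^3/c^2$, the sketch below converts an occupation number per discrete state into an intensity per unit frequency range and back into a stimulated-to-spontaneous ratio. It assumes Gaussian (CGS) units; the function names are hypothetical and the numerical constants are standard values, not taken from the text.

```python
H_PLANCK = 6.62607015e-27  # Planck's constant h in erg s (CGS, assumed units)
C_LIGHT = 2.99792458e10    # velocity of light in cm/s

def spontaneous_equivalent_intensity(nu):
    """The term h nu^3 / c^2 that accompanies the intensity in the emission law."""
    return H_PLANCK * nu ** 3 / C_LIGHT ** 2

def intensity_from_occupation(n, nu):
    """Intensity per unit frequency range for n photons per discrete state."""
    return n * H_PLANCK * nu ** 3 / C_LIGHT ** 2

def stimulated_to_spontaneous_ratio(intensity, nu):
    """Ratio of stimulated to spontaneous emission for a given incident intensity."""
    return intensity / spontaneous_equivalent_intensity(nu)

nu_visible = 5.0e14  # an illustrative optical frequency in Hz
ratio = stimulated_to_spontaneous_ratio(
    intensity_from_occupation(0.25, nu_visible), nu_visible
)
```

The ratio is simply the occupation number of the discrete state, so stimulated emission dominates only when there is more than one photon per state.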

The probability of a photon being scattered from the state $\mathbf{p}'\mathbf{l}'$ to
the state $\mathbf{p}''\mathbf{l}''$ is proportional to $n'_{\mathbf{p}'\mathbf{l}'}(n'_{\mathbf{p}''\mathbf{l}''}+1)$, the $n'$'s being the
numbers of photons initially in the discrete states concerned. We can
interpret this result as the probability being proportional to

$$I_{\nu'\mathbf{l}'}\,(I_{\nu''\mathbf{l}''} + h\nu''^3/c^2). \tag{60}$$

Similarly for a more general radiative process in which several
photons are emitted and absorbed, the probability is proportional
to a factor $I_{\nu\mathbf{l}}$ for each absorbed photon and a factor $I_{\nu\mathbf{l}} + h\nu^3/c^2$ for
each emitted photon. Thus the process is stimulated by incident
radiation in the same direction and with the same frequency and
polarization as any of the emitted photons.


63. The interaction energy between photons and an atom 

We shall now determine the interaction energy between an atom 
and an assembly of photons, i.e. the H Q of equation (53), from 
analogy with the classical expression for the interaction energy 
between an atom and a field of radiation. For simplicity we shall 
suppose the atom to consist of a single electron moving in an electro- 
static field of force. The field of radiation may be described by a 
scalar and a vector potential. These potentials are to a certain extent 
arbitrary and may be chosen so that the scalar potential vanishes. 
The field is then completely described by the vector potential $A_x$, $A_y$,
$A_z$, or $\mathbf{A}$. The change that the field causes in the Hamiltonian
describing the atom is now, as explained at the beginning of § 41,

$$H_Q = \frac{1}{2m}\Bigl\{\Bigl(\mathbf{p}+\frac{e}{c}\mathbf{A}\Bigr)^2 - \mathbf{p}^2\Bigr\} = \frac{e}{mc}(\mathbf{p},\mathbf{A}) + \frac{e^2}{2mc^2}(\mathbf{A},\mathbf{A}). \tag{61}$$

This is the classical interaction energy. The $\mathbf{A}$ that occurs here should
be the value of the vector potential at the point where the electron is
momentarily situated. It is, however, a good enough approximation
if we take this $\mathbf{A}$ to be the vector potential at some fixed point in the
atom, such as the nucleus, provided we are dealing with radiation
whose wavelength is large compared with the dimensions of the atom.

Let us first consider the field of radiation classically and ignore its
interaction with the atom. The vector potential $\mathbf{A}$ satisfies, according
to Maxwell’s theory, the equations

$$\Box\mathbf{A} = 0, \qquad \operatorname{div}\mathbf{A} = 0, \tag{62}$$

$\Box$ being short for $\partial^2/c^2\partial t^2 - \partial^2/\partial x^2 - \partial^2/\partial y^2 - \partial^2/\partial z^2$. The first of these
equations shows that $\mathbf{A}$ can be resolved into Fourier components in
the form

$$\mathbf{A} = \int \bigl\{\mathbf{A}_{\mathbf{k}}\,e^{-i(\mathbf{k},\mathbf{x})+2\pi i\nu_{\mathbf{k}}t} + \bar{\mathbf{A}}_{\mathbf{k}}\,e^{i(\mathbf{k},\mathbf{x})-2\pi i\nu_{\mathbf{k}}t}\bigr\}\,d^3k, \tag{63}$$

each Fourier component representing a train of waves moving with
the velocity of light, described by a vector $\mathbf{k}$ whose direction gives
the direction of motion of the waves and whose magnitude $|\mathbf{k}|$ is
connected with their frequency $\nu_{\mathbf{k}}$ by

$$2\pi\nu_{\mathbf{k}} = c|\mathbf{k}|. \tag{64}$$

The vector $\mathbf{k}$ is just the momentum of a photon which the quantum
theory would associate with these waves, divided by $\hbar$. For each
value of $\mathbf{k}$ we have an amplitude $\mathbf{A}_{\mathbf{k}}$, which is in general a complex
vector, and the integral in (63) extends over the whole of the
three-dimensional $\mathbf{k}$-space. The second of equations (62) gives

$$(\mathbf{k}, \mathbf{A}_{\mathbf{k}}) = 0, \tag{65}$$

showing that for each value of $\mathbf{k}$, $\mathbf{A}_{\mathbf{k}}$ is perpendicular to $\mathbf{k}$. This
expresses that the waves are transverse waves. $\mathbf{A}_{\mathbf{k}}$ is determined by
its two components in two directions perpendicular to each other and
to $\mathbf{k}$, these two components corresponding to two independent states
of linear polarization.

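The total energy of the radiation is given by the volume integral

$$H_R = (8\pi)^{-1}\int (\mathcal{E}^2 + \mathcal{H}^2)\,d^3x \tag{66}$$

taken over the whole of space, where the electric field $\mathcal{E}$ and the
magnetic field $\mathcal{H}$ of the radiation are given by

$$\mathcal{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathcal{H} = \operatorname{curl}\mathbf{A}. \tag{67}$$

Using standard formulas of vector analysis, we have

$$\operatorname{div}[\mathbf{A}\times\mathcal{H}] = (\mathcal{H}, \operatorname{curl}\mathbf{A}) - (\mathbf{A}, \operatorname{curl}\mathcal{H}) = \mathcal{H}^2 - (\mathbf{A}, \operatorname{curl}\operatorname{curl}\mathbf{A}) = \mathcal{H}^2 + (\mathbf{A}, \nabla^2\mathbf{A})$$

with the help of the second of equations (62). Thus (66) becomes,
with neglect of a term which can be transformed to a surface integral
at infinity,

$$H_R = (8\pi)^{-1}\int \Bigl\{\frac{1}{c^2}\Bigl(\frac{\partial\mathbf{A}}{\partial t}, \frac{\partial\mathbf{A}}{\partial t}\Bigr) - (\mathbf{A}, \nabla^2\mathbf{A})\Bigr\}\,d^3x. \tag{68}$$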

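By substituting for $\mathbf{A}$ here its value given by (63), we can get the
energy of the radiation in terms of the Fourier amplitudes $\mathbf{A}_{\mathbf{k}}$. The
energy of the radiation is constant (since we are now ignoring the
interaction of the radiation and the atom), so in this calculation we
may take $t = 0$. This means taking

$$\mathbf{A} = \int (\mathbf{A}_{\mathbf{k}} + \bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{k},\mathbf{x})}\,d^3k, \tag{69}$$

$$\nabla^2\mathbf{A} = -\int k^2(\mathbf{A}_{\mathbf{k}} + \bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{k},\mathbf{x})}\,d^3k,$$

$$\partial\mathbf{A}/\partial t = ic\int |\mathbf{k}|(\mathbf{A}_{\mathbf{k}} - \bar{\mathbf{A}}_{-\mathbf{k}})\,e^{-i(\mathbf{k},\mathbf{x})}\,d^3k. \tag{70}$$

Inserting these expressions in (68), we get

$$H_R = (8\pi)^{-1}\iiint \bigl\{k'^2(\mathbf{A}_{\mathbf{k}} + \bar{\mathbf{A}}_{-\mathbf{k}},\ \mathbf{A}_{\mathbf{k}'} + \bar{\mathbf{A}}_{-\mathbf{k}'}) - |\mathbf{k}||\mathbf{k}'|(\mathbf{A}_{\mathbf{k}} - \bar{\mathbf{A}}_{-\mathbf{k}},\ \mathbf{A}_{\mathbf{k}'} - \bar{\mathbf{A}}_{-\mathbf{k}'})\bigr\}\,e^{-i(\mathbf{k}+\mathbf{k}',\mathbf{x})}\,d^3k\,d^3k'\,d^3x$$

$$= \pi^2\iint \bigl\{k'^2(\mathbf{A}_{\mathbf{k}} + \bar{\mathbf{A}}_{-\mathbf{k}},\ \mathbf{A}_{\mathbf{k}'} + \bar{\mathbf{A}}_{-\mathbf{k}'}) - |\mathbf{k}||\mathbf{k}'|(\mathbf{A}_{\mathbf{k}} - \bar{\mathbf{A}}_{-\mathbf{k}},\ \mathbf{A}_{\mathbf{k}'} - \bar{\mathbf{A}}_{-\mathbf{k}'})\bigr\}\,\delta(\mathbf{k}+\mathbf{k}')\,d^3k\,d^3k',$$

with the help of formula (49) of § 23, $\delta(\mathbf{k}+\mathbf{k}')$ being the product of
three factors, one for each component of $\mathbf{k}$. Hence

$$H_R = \pi^2\int k^2\bigl\{(\mathbf{A}_{\mathbf{k}} + \bar{\mathbf{A}}_{-\mathbf{k}},\ \mathbf{A}_{-\mathbf{k}} + \bar{\mathbf{A}}_{\mathbf{k}}) - (\mathbf{A}_{\mathbf{k}} - \bar{\mathbf{A}}_{-\mathbf{k}},\ \mathbf{A}_{-\mathbf{k}} - \bar{\mathbf{A}}_{\mathbf{k}})\bigr\}\,d^3k$$

$$= 2\pi^2\int k^2\bigl\{(\mathbf{A}_{\mathbf{k}}, \bar{\mathbf{A}}_{\mathbf{k}}) + (\mathbf{A}_{-\mathbf{k}}, \bar{\mathbf{A}}_{-\mathbf{k}})\bigr\}\,d^3k = 4\pi^2\int k^2(\mathbf{A}_{\mathbf{k}}, \bar{\mathbf{A}}_{\mathbf{k}})\,d^3k. \tag{71}$$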

We can replace the continuous distribution of $\mathbf{k}$-values by a dust of
discrete $\mathbf{k}$-values, as we did with the $\mathbf{p}$-values in the preceding
section. The integral (71) then goes over, according to formula (52),
into the sum

$$H_R = 4\pi^2\sum_{\mathbf{k}} k^2(\mathbf{A}_{\mathbf{k}}, \bar{\mathbf{A}}_{\mathbf{k}})\,s_{\mathbf{k}}^{-1},$$

$s_{\mathbf{k}}$ being the density of the discrete $\mathbf{k}$-values. We may also write
this as

$$H_R = 4\pi^2\sum_{\mathbf{k}\mathbf{l}} k^2 A_{\mathbf{k}\mathbf{l}}\bar A_{\mathbf{k}\mathbf{l}}\,s_{\mathbf{k}}^{-1}, \tag{72}$$

$A_{\mathbf{k}\mathbf{l}}$ being a component of $\mathbf{A}_{\mathbf{k}}$ in a direction $\mathbf{l}$ perpendicular to $\mathbf{k}$ and
the summation with respect to $\mathbf{l}$ referring to two directions $\mathbf{l}$
perpendicular to each other. Thus there is one term in (72) for each
independent stationary state for a photon.

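The field quantities $\mathcal{E}$ and $\mathcal{H}$ at any point $\mathbf{x}$ can be looked upon
as dynamical variables. The quantities

$$A_{\mathbf{k}\mathbf{l}t} = A_{\mathbf{k}\mathbf{l}}\,e^{2\pi i\nu_{\mathbf{k}}t}, \qquad \bar A_{\mathbf{k}\mathbf{l}t} = \bar A_{\mathbf{k}\mathbf{l}}\,e^{-2\pi i\nu_{\mathbf{k}}t}$$

are then dynamical variables at time $t$, since they are connected with
$\mathcal{E}$ and $\mathcal{H}$ at various points $\mathbf{x}$ at time $t$ by equations which do not
involve $t$, as follows from (63) and (67). $A_{\mathbf{k}\mathbf{l}}$ is constant, so $A_{\mathbf{k}\mathbf{l}t}$ varies
with $t$ according to the simple harmonic law. Thus $A_{\mathbf{k}\mathbf{l}t}$ is like the $\eta_t$
of a harmonic oscillator, defined by (3) of § 34, the $\omega$ of the oscillator
being $2\pi\nu_{\mathbf{k}}$. We may take each $A_{\mathbf{k}\mathbf{l}t}$ to be proportional to the $\eta_t$ of
some harmonic oscillator and then the field of radiation becomes a
set of harmonic oscillators.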

Let us now pass over to the quantum theory and take the $A_{\mathbf{k}\mathbf{l}t}$, $\bar A_{\mathbf{k}\mathbf{l}t}$
to be dynamical variables in the Heisenberg picture. The expression
(72) for the energy may be retained unchanged, the order in which
the factors $A_{\mathbf{k}\mathbf{l}}$, $\bar A_{\mathbf{k}\mathbf{l}}$ there occur being the correct one to give no
zero-point energy. The $A_{\mathbf{k}\mathbf{l}t}$ then still vary with time according to the
law $e^{2\pi i\nu_{\mathbf{k}}t}$ and may still be taken to be proportional to the $\eta_t$'s of harmonic
oscillators. The factor of proportionality may be obtained by
equating (72) to the expression (39) for the energy, with the label $a$ replaced
by the two labels $\mathbf{k}$ and $\mathbf{l}$ and with $h\nu_{\mathbf{k}}$ for $\hbar\omega_a$. This gives

$$4\pi^2k^2A_{\mathbf{k}\mathbf{l}t}\bar A_{\mathbf{k}\mathbf{l}t}\,s_{\mathbf{k}}^{-1} = h\nu_{\mathbf{k}}\,\eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t},$$

the suffix $t$ being inserted to show that we are dealing with Heisenberg
dynamical variables (as we should when transferring equations of the
classical theory to the quantum theory). Hence, using (64),

$$4\pi^2A_{\mathbf{k}\mathbf{l}t} = c\,h^{1/2}\nu_{\mathbf{k}}^{-1/2}s_{\mathbf{k}}^{1/2}\,\eta_{\mathbf{k}\mathbf{l}t} \tag{73}$$

with neglect of an unimportant arbitrary phase factor. In this way
the Heisenberg dynamical variables $\eta_{\mathbf{k}\mathbf{l}t}$, which describe the field of
radiation as a set of oscillators, are introduced. The commutation
relations between the $\eta_{\mathbf{k}\mathbf{l}t}$ and $\bar\eta_{\mathbf{k}\mathbf{l}t}$ are known, being given by (11), so
equation (73) fixes the commutation relations between the $A_{\mathbf{k}\mathbf{l}t}$ and
$\bar A_{\mathbf{k}\mathbf{l}t}$. It thus fixes the commutation relations between the potentials
$\mathbf{A}$ and the field quantities $\mathcal{E}$ and $\mathcal{H}$ at various points $\mathbf{x}$ at the time $t$.
(Incidentally, the commutation relations of the $A_{\mathbf{k}\mathbf{l}}$, $\bar A_{\mathbf{k}\mathbf{l}}$ are fixed,
so the commutation relation of two potential or field quantities at
two different times is also fixed.)
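
The factor of proportionality in (73) follows from the equated energies in a few lines. The steps below are a worked restatement, using only (64) in the form $k^2 = 4\pi^2\nu_{\mathbf{k}}^2/c^2$ and taking the arbitrary phase factor to be unity, which is a choice made here for definiteness.

```latex
\begin{align*}
4\pi^2 k^2 A_{\mathbf{k}\mathbf{l}t}\bar A_{\mathbf{k}\mathbf{l}t}\,s_{\mathbf{k}}^{-1}
   &= h\nu_{\mathbf{k}}\,\eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t},
   \qquad k^2 = \frac{4\pi^2\nu_{\mathbf{k}}^2}{c^2}, \\
\frac{16\pi^4\nu_{\mathbf{k}}^2}{c^2}\,A_{\mathbf{k}\mathbf{l}t}\bar A_{\mathbf{k}\mathbf{l}t}
   &= h\nu_{\mathbf{k}}\,s_{\mathbf{k}}\,\eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t}, \\
A_{\mathbf{k}\mathbf{l}t}\bar A_{\mathbf{k}\mathbf{l}t}
   &= \frac{c^2 h\,s_{\mathbf{k}}}{16\pi^4\nu_{\mathbf{k}}}\,
      \eta_{\mathbf{k}\mathbf{l}t}\bar\eta_{\mathbf{k}\mathbf{l}t}, \\
4\pi^2 A_{\mathbf{k}\mathbf{l}t}
   &= c\,h^{1/2}\nu_{\mathbf{k}}^{-1/2}s_{\mathbf{k}}^{1/2}\,\eta_{\mathbf{k}\mathbf{l}t}.
\end{align*}
```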

We can still use (73) when the interaction between the field of
radiation and the atom is taken into account. This involves assuming
that the interaction does not affect the commutation relations
between the potentials and field quantities at a given time. The
interaction causes the $\eta_{\mathbf{k}\mathbf{l}t}$'s to cease to vary according to the simple
harmonic law and the oscillators to cease to be harmonic. Thus it
may affect the commutation relation between two potential or field
quantities at two different times.

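We can now take over the interaction energy (61) into the quantum
theory, putting $\mathbf{p}_t$ for $\mathbf{p}$ to show it is a Heisenberg dynamical variable.
Taking the atomic nucleus to be at the origin we get, by substituting
(63) with $\mathbf{x} = 0$ into (61),

$$H_{Qt} = \frac{e}{mc}\int (\mathbf{p}_t,\ \mathbf{A}_{\mathbf{k}t} + \bar{\mathbf{A}}_{\mathbf{k}t})\,d^3k + \frac{e^2}{2mc^2}\iint (\mathbf{A}_{\mathbf{k}t} + \bar{\mathbf{A}}_{\mathbf{k}t},\ \mathbf{A}_{\mathbf{k}'t} + \bar{\mathbf{A}}_{\mathbf{k}'t})\,d^3k\,d^3k'$$

$$= \frac{e}{mc}\sum_{\mathbf{k}} (\mathbf{p}_t,\ \mathbf{A}_{\mathbf{k}t} + \bar{\mathbf{A}}_{\mathbf{k}t})\,s_{\mathbf{k}}^{-1} + \frac{e^2}{2mc^2}\sum_{\mathbf{k}\mathbf{k}'} (\mathbf{A}_{\mathbf{k}t} + \bar{\mathbf{A}}_{\mathbf{k}t},\ \mathbf{A}_{\mathbf{k}'t} + \bar{\mathbf{A}}_{\mathbf{k}'t})\,s_{\mathbf{k}}^{-1}s_{\mathbf{k}'}^{-1}$$

if we pass from continuous to discrete $\mathbf{k}$-values. Thus

$$H_{Qt} = \frac{e}{mc}\sum_{\mathbf{k}\mathbf{l}} p_{\mathbf{l}t}\,(A_{\mathbf{k}\mathbf{l}t} + \bar A_{\mathbf{k}\mathbf{l}t})\,s_{\mathbf{k}}^{-1} + \frac{e^2}{2mc^2}\sum_{\mathbf{k}\mathbf{k}'\mathbf{l}\mathbf{l}'} (A_{\mathbf{k}\mathbf{l}t} + \bar A_{\mathbf{k}\mathbf{l}t})(A_{\mathbf{k}'\mathbf{l}'t} + \bar A_{\mathbf{k}'\mathbf{l}'t})\,(\mathbf{l}\mathbf{l}')\,s_{\mathbf{k}}^{-1}s_{\mathbf{k}'}^{-1},$$

$p_{\mathbf{l}t}$ being the component of $\mathbf{p}_t$ in the direction $\mathbf{l}$. With the help of (73)
we may express $H_{Qt}$ in terms of the $\eta_{\mathbf{k}\mathbf{l}t}$ and $\bar\eta_{\mathbf{k}\mathbf{l}t}$, and we can then drop
the suffix $t$ (which means going over to Schrödinger dynamical
variables), so that we obtain finally

$$H_Q = \frac{e h^{1/2}}{4\pi^2 m}\sum_{\mathbf{k}\mathbf{l}} p_{\mathbf{l}}\,\nu_{\mathbf{k}}^{-1/2}(\eta_{\mathbf{k}\mathbf{l}} + \bar\eta_{\mathbf{k}\mathbf{l}})\,s_{\mathbf{k}}^{-1/2} + \frac{e^2 h}{32\pi^4 m}\sum_{\mathbf{k}\mathbf{k}'\mathbf{l}\mathbf{l}'} \nu_{\mathbf{k}}^{-1/2}\nu_{\mathbf{k}'}^{-1/2}(\eta_{\mathbf{k}\mathbf{l}} + \bar\eta_{\mathbf{k}\mathbf{l}})(\eta_{\mathbf{k}'\mathbf{l}'} + \bar\eta_{\mathbf{k}'\mathbf{l}'})\,(\mathbf{l}\mathbf{l}')\,s_{\mathbf{k}}^{-1/2}s_{\mathbf{k}'}^{-1/2}. \tag{74}$$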

With the model of the atom we are using, the interaction energy
appears as a linear plus a quadratic function in the $\eta$'s and $\bar\eta$'s. The
linear terms give rise to emission and absorption processes, the
quadratic ones to scattering processes and processes in which two
photons are absorbed or emitted simultaneously. The order of the
factors $\eta$ and $\bar\eta$ in the quadratic terms is not determined by the
procedure of working from the classical theory, but this order is
unimportant, since a change in it merely changes $H_Q$ by a constant.

The matrix element of $H_Q$ referring to the emission of a photon
into the discrete state $\mathbf{k}\mathbf{l}$, or into the discrete state $\mathbf{p}'\mathbf{l}$, as it may also
be labelled, with the atom jumping from state $\alpha^0$ to state $\alpha'$, is

$$\langle\mathbf{p}'\mathbf{l}D\,\alpha'|H_Q|\alpha^0\rangle = \frac{e h^{1/2}}{4\pi^2 m\,\nu'^{1/2}}\,\langle\alpha'|p_{\mathbf{l}}|\alpha^0\rangle\,s_{\mathbf{k}}^{-1/2} = \frac{e}{h\,m\,(2\pi\nu')^{1/2}}\,\langle\alpha'|p_{\mathbf{l}}|\alpha^0\rangle\,s_{\mathbf{p}'}^{-1/2},$$

since $s_{\mathbf{k}} = s_{\mathbf{p}}\hbar^3$. The $p_{\mathbf{l}}$ occurring here, referring to the momentum
of the electron, is, of course, quite distinct from the other letters $\mathbf{p}$,
referring to the momentum of the emitted photon. To avoid
confusion we shall replace the electron momentum $\mathbf{p}$ by $m\dot{\mathbf{x}}$, these two
dynamical variables being the same for the unperturbed atom.
Passing over to continuous photon states by means of the conjugate
imaginary of equation (56), we get

$$\langle\mathbf{p}'\mathbf{l}\,\alpha'|H_Q|\alpha^0\rangle = \frac{e}{h\,(2\pi\nu')^{1/2}}\,\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle. \tag{75}$$

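Similarly, the matrix element of $H_Q$ referring to the absorption of a
photon from the continuous state $\mathbf{p}^0\mathbf{l}$ with the atom jumping from
state $\alpha^0$ to state $\alpha'$ is

$$\langle\alpha'|H_Q|\mathbf{p}^0\mathbf{l}\,\alpha^0\rangle = \frac{e}{h\,(2\pi\nu^0)^{1/2}}\,\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle, \tag{76}$$

and the matrix element referring to the scattering of a photon from
the continuous state $\mathbf{p}^0\mathbf{l}^0$ to the continuous state $\mathbf{p}'\mathbf{l}'$ with the atom
jumping from state $\alpha^0$ to state $\alpha'$ is

$$\langle\mathbf{p}'\mathbf{l}'\alpha'|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha^0\rangle = \frac{e^2}{2\pi h^2 m\,\nu^{0\,1/2}\nu'^{1/2}}\,(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}, \tag{77}$$

there being two terms in (74) which contribute to it. These matrix
elements will be used in the next section. The matrix elements
referring to the simultaneous absorption or emission of two photons
may be written down in the same way, but they lead to physical
effects too small to be of practical importance.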

64. Emission, absorption, and scattering of radiation

We can now determine directly the coefficients of emission,
absorption, and scattering of radiation by substituting in the formulas of
Chapter VIII the values for the matrix elements given by (75), (76),
and (77).

For determining the emission probability we can use formula
(56) of § 53. This shows that for an atom in a state $\alpha^0$ the
probability per unit time per unit solid angle of its spontaneously emitting
a photon and dropping to a state $\alpha'$ of lower energy is

$$\frac{4\pi^2}{h}\,\frac{WP}{c^2}\,\Bigl|\frac{e}{h\,(2\pi\nu)^{1/2}}\,\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle\Bigr|^2. \tag{78}$$

Now the energy and momentum of a photon of frequency $\nu$ are

$$W = h\nu, \qquad P = h\nu/c.$$

Again, from the Heisenberg law (20) of § 29,

$$\langle\alpha'|\dot x_{\mathbf{l}}|\alpha^0\rangle = -2\pi i\,\nu(\alpha^0\alpha')\,\langle\alpha'|x_{\mathbf{l}}|\alpha^0\rangle,$$

$\nu(\alpha^0\alpha')$ being the frequency connected with transitions from state $\alpha^0$
to state $\alpha'$, which in the present case is just the frequency $\nu$ of the
emitted radiation. These results substituted in (78) make the
emission coefficient reduce to

$$\frac{(2\pi\nu)^3}{hc^3}\,|\langle\alpha'|e x_{\mathbf{l}}|\alpha^0\rangle|^2. \tag{79}$$

To obtain the rate of emission of energy per unit solid angle for a
specified polarization, we must multiply this by $h\nu$. This gives for
the total rate of emission of energy in all directions

$$\frac{4}{3}\,\frac{(2\pi\nu)^4}{c^3}\,|\langle\alpha'|e\mathbf{x}|\alpha^0\rangle|^2, \tag{80}$$

which is in agreement with expression (34) of § 45 and justifies
Heisenberg’s assumption for the interpretation of his matrix elements.
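
The passage from the emission rate per unit solid angle and per polarization to the total rate involves summing over the two polarizations and integrating over all directions, which supplies a factor $8\pi/3$. The sketch below checks that factor numerically under a simplifying assumption made here for illustration: the matrix element of $e\mathbf{x}$ is represented by a real unit vector `d`. The two polarization vectors are taken along the polar and azimuthal directions, which is one admissible choice; all function names are hypothetical.

```python
import math

def polarization_sum(theta, phi, d):
    """Sum over two polarizations l of (l . d)^2 for emission direction (theta, phi)."""
    l1 = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    l2 = (-math.sin(phi), math.cos(phi), 0.0)  # both are perpendicular to the direction
    dot1 = sum(a * b for a, b in zip(l1, d))
    dot2 = sum(a * b for a, b in zip(l2, d))
    return dot1 ** 2 + dot2 ** 2

def angular_integral(d, n_theta=200, n_phi=200):
    """Midpoint-rule integral of polarization_sum over the full solid angle."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += polarization_sum(theta, phi, d) * math.sin(theta) * d_theta * d_phi
    return total

along_z = angular_integral((0.0, 0.0, 1.0))
tilted = angular_integral((1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0))
```

Multiplying (79) by $h\nu$ and by this factor of $8\pi/3$ gives the coefficient $\tfrac{4}{3}(2\pi\nu)^4/c^3$ of (80).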

In the same way the absorption coefficient, given by formula
(59) of § 53, becomes for photons

$$\frac{8\pi^3\nu}{c}\,|\langle\alpha'|e x_{\mathbf{l}}|\alpha^0\rangle|^2.$$

This absorption coefficient refers to an incident beam of one photon
crossing unit area per unit time per unit energy range. If we take
one per unit frequency range instead of energy range, as is usual
when dealing with radiation, the absorption coefficient becomes

$$\frac{8\pi^3\nu}{hc}\,|\langle\alpha'|e x_{\mathbf{l}}|\alpha^0\rangle|^2.$$

This result is the same as (32) of § 45, if we substitute for the $E_\nu$
there the energy $h\nu$ of a single photon. Thus the elementary theory
of § 45, in which the radiation field is treated as an external
perturbation, gives the correct value for the absorption coefficient.

This agreement between the elementary theory and the present
theory could be inferred from general arguments. The two theories
differ only in that the field quantities all commute with one another 
in the elementary theory and satisfy definite commutation relations 
in the present theory, and this difference becomes unimportant for 
strong fields. Thus the two theories must give the same absorption 
and emission when strong fields are concerned. Since both theories 
give the rate of absorption proportional to the intensity of the inci- 
dent beam, the agreement must hold also for weak fields in the case of 
absorption. In the same way the stimulated part of the emission in the 
present theory must agree with the emission in the elementary theory. 

Let us now consider scattering. The direct scattering coefficient is
given by formula (38) of § 50. Such scattering of photons will not be
accompanied by any change of state of the atom on account of the
factor $\delta_{\alpha'\alpha^0}$ in the expression for the matrix element (77). Thus the
final energy $W'$ of the photon will equal its initial energy $W^0$. The
scattering coefficient now reduces to

$$\frac{e^4}{m^2c^4}\,(\mathbf{l}'\mathbf{l}^0)^2.$$

This is the same as that given by classical mechanics for the scattering 
of radiation by a free electron. We thus see that the direct scatter- 
ing of radiation by an electron in an atom is independent of the atom 
and is correctly given by the classical theory. This result, it should 
be remembered, holds only provided the wavelength of the radiation 
is large compared with the dimensions of the atom. 
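
To give a sense of scale, the direct scattering coefficient can be evaluated numerically. The sketch below assumes Gaussian (CGS) units and standard values of the constants, neither of which is stated in the text; the function names are hypothetical. The quantity $e^2/mc^2$ is the length usually called the classical electron radius.

```python
E_CHARGE = 4.80320471e-10  # electron charge in esu (CGS, assumed units)
M_ELECTRON = 9.1093837e-28  # electron mass in g
C_LIGHT = 2.99792458e10     # velocity of light in cm/s

def classical_electron_radius():
    """The length e^2 / (m c^2) whose square sets the scale of the scattering."""
    return E_CHARGE ** 2 / (M_ELECTRON * C_LIGHT ** 2)

def direct_scattering_coefficient(l_out, l_in):
    """(e^4 / m^2 c^4) (l' . l0)^2 in cm^2 per unit solid angle, for unit vectors."""
    cosine = sum(a * b for a, b in zip(l_out, l_in))
    return classical_electron_radius() ** 2 * cosine ** 2

r_e = classical_electron_radius()
parallel = direct_scattering_coefficient((1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
crossed = direct_scattering_coefficient((0.0, 1.0, 0.0), (1.0, 0.0, 0.0))
```

With parallel polarizations the coefficient is the square of this length, and with perpendicular polarizations it vanishes.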

The direct scattering is a mathematical concept and cannot be
separated out experimentally from the total scattering, given by
formula (44) of § 51. Let us see what this total scattering is in the
case of photons. We must be careful in our application of formula
(44) of § 51. The summation $\sum_k$ in this formula may be considered as
representing the contribution to the scattering of double transitions
consisting of transitions firstly from the initial state to state $k$ and
secondly from state $k$ to the final state. The first transition may be
an absorption of the incident photon and the second an emission of
the required scattered photon, but it is also possible for the first
transition to be the emission and the second the absorption. It is
clear from the general nature of the method used for deriving formula
(44) of § 51 that both these kinds of double transitions must be
included in the summation $\sum_k$ when this formula is applied to photons,
although only the first of them appears in the actual derivation given
in § 51, as the possibility of the particle being created or annihilated
was not taken into account there.

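We use zero, single prime, and double prime to refer to the initial,
final, and intermediate states of the atom respectively, and zero and
single prime to refer to the absorbed and emitted photons respectively.
Then, for the double transition of absorption followed by
emission, we must take for the matrix elements

$$\langle k|V|\mathbf{p}^0\alpha^0\rangle, \qquad \langle\mathbf{p}'\alpha'|V|k\rangle$$

of the formula (44) of § 51

$$\langle k|V|\mathbf{p}^0\alpha^0\rangle = \langle\alpha''|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha^0\rangle, \qquad \langle\mathbf{p}'\alpha'|V|k\rangle = \langle\mathbf{p}'\mathbf{l}'\alpha'|H_Q|\alpha''\rangle.$$

Also

$$E'-E_k = h\nu^0 + H_P(\alpha^0) - H_P(\alpha'') = h[\nu^0 - \nu(\alpha''\alpha^0)],$$

where

$$h\nu(\alpha''\alpha^0) = H_P(\alpha'') - H_P(\alpha^0).$$

Similarly, for the double transition of emission followed by absorption
we must take

$$\langle k|V|\mathbf{p}^0\alpha^0\rangle = \langle\mathbf{p}'\mathbf{l}'\alpha''|H_Q|\alpha^0\rangle, \qquad \langle\mathbf{p}'\alpha'|V|k\rangle = \langle\alpha'|H_Q|\mathbf{p}^0\mathbf{l}^0\alpha''\rangle$$

and

$$E'-E_k = h\nu^0 + H_P(\alpha^0) - H_P(\alpha'') - h\nu^0 - h\nu' = -h[\nu' + \nu(\alpha''\alpha^0)],$$

there being now two photons, of frequencies $\nu^0$ and $\nu'$, in existence
for the intermediate state. Substituting in (44) of § 51 the values of
the matrix elements given by (75), (76), and (77), we get for the
scattering coefficient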




Wc* v °| m 




+ 


<q , |3C l -|a*><a"'|iC 1 »|tn 0 > <a'|i 1 .|a*><ot*lz 1 -la°> 




v(aV) 


v'+v(«V) 


. ( 81 ) 


If we write (81) in terms of $x$ instead of $\dot{x}$, we get

<a'|a,«|a , ><a*|ie l >|a (l » |* . 

v'+v(a'a«) || • VU) 

We can simplify (82) with the help of the quantum conditions.
We have

$$x_{l'}x_{l^0} - x_{l^0}x_{l'} = 0,$$

which gives

$$\sum_{\alpha''}\{\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle - \langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle\} = 0, \tag{83}$$

or 

and also 

= l/m.(x r p P — PfXf) = iK/m.(l'l°), 

which gives 

X {< a, |*i'l a *> •«'(aV)<a ir |a:,t|a 0 >— v(a'a')<a'lz t ,la'> . <a"’|a: r |# # >} 

Multiplying (83) by $\nu'$ and adding to (84), we obtain
|{< a '|* I >'><a'|^| a ®>[v'+v(aV)]-< a '^.|aO<«'K> 0 >K+*'(«'a')]} 

= fy 27 rw.(l'l°) 8 a . a .. 


If we substitute this expression for $\frac{h}{2\pi m}(\mathbf{l}'\mathbf{l}^0)\,\delta_{\alpha'\alpha^0}$ in (82), we
obtain, after a straightforward reduction making use of identical
relations between the $\nu$'s,


(2we)* 




H?{- 


>] 2 


(85) 


v°—v((x”(x 0 ) v'+v(aV>) 

This gives the scattering coefficient in the form of the effective
area that a photon has to hit per unit solid angle of scattering. It is
known as the Kramers-Heisenberg dispersion formula, having been first
obtained by these authors from analogies with the classical theory
of dispersion.

The fact that the various terms in (82) can be combined to give
the result (85) justifies the assumption made in deriving formula (44)
of § 51, that the matrix elements $\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle$ of the interaction
energy are of the second order of smallness compared with the
$\langle\mathbf{p}'\alpha'|V|k\rangle$ ones, at any rate when the scattered particles are photons.




65. An assembly of fermions

An assembly of fermions can be treated by a method similar to
that used in §§ 59 and 60 for bosons. With the kets (1) we may use
the antisymmetrizing operator A defined by

$$A = u'!^{-1/2}\sum \pm P, \tag{2'}$$

summed over all permutations P, the + or − sign being taken
according to whether P is even or odd. Applied to the ket (1) it gives

$$u'!^{-1/2}\sum \pm P\,|\alpha_1^a\alpha_2^b\alpha_3^c\ldots\alpha_{u'}^g\rangle = A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle, \tag{3'}$$

a ket corresponding to a state for an assembly of $u'$ fermions. The
ket (3') is normalized provided the individual fermion kets $|\alpha^a\rangle$, $|\alpha^b\rangle$,...
are all different, otherwise it is zero. In this respect the ket (3') is
simpler than the ket (3). However, (3') is more complicated than (3)
in that (3') depends on the order in which $\alpha^a$, $\alpha^b$, $\alpha^c$,... occur in it,
being subject to a change of sign if an odd permutation is applied
to this order.
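The two properties just stated, the vanishing of (3') when two fermion states coincide and its change of sign under an odd permutation of the labels, can be checked on a small case. The following is only an illustrative sketch: the representation of a product ket as a tuple of labels and the helper names `permutation_sign`, `antisymmetrize` and `squared_length` are assumptions made for the example, not notation from the text.

```python
from itertools import permutations
from math import factorial, sqrt

def permutation_sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:               # sort by transpositions, flipping the sign each time
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def antisymmetrize(labels):
    """Apply A = u!^(-1/2) * sum(+-P) to the product ket with the given labels.

    The result is a dict mapping each ordered product ket (a tuple) to its amplitude.
    """
    u = len(labels)
    norm = 1.0 / sqrt(factorial(u))
    ket = {}
    for perm in permutations(range(u)):
        term = tuple(labels[i] for i in perm)
        ket[term] = ket.get(term, 0.0) + norm * permutation_sign(perm)
    # drop terms whose amplitudes have cancelled
    return {k: v for k, v in ket.items() if abs(v) > 1e-12}

def squared_length(ket):
    """Squared length of a ket expanded in orthonormal product kets."""
    return sum(abs(v) ** 2 for v in ket.values())

distinct = antisymmetrize(("a", "b", "c"))   # all states different
repeated = antisymmetrize(("a", "a", "c"))   # two states coincide
swapped = antisymmetrize(("b", "a", "c"))    # odd permutation of the labels

print(squared_length(distinct))   # normalized when the states are all different
print(repeated)                   # empty: the ket vanishes
print(all(abs(swapped[k] + distinct[k]) < 1e-12 for k in distinct))
```

The single-particle kets are taken to be orthonormal here, which is what allows the squared length to be computed as a plain sum of squared amplitudes.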

We can, as before, introduce the numbers $n_1, n_2, n_3,\ldots$ of fermions
in the states $\alpha^{(1)}, \alpha^{(2)}, \alpha^{(3)},\ldots$ and treat them as dynamical variables or
observables. They each have as eigenvalues only 0 and 1. They form
a complete set of commuting observables for the assembly of fermions.
The basic kets of a representation with the n's diagonal may be taken
to be connected with the kets (3') by the equation

$$A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = \pm|n_1'n_2'n_3'\ldots\rangle, \tag{6'}$$

corresponding to (6), the $n'$'s being connected with the variables
$\alpha^a$, $\alpha^b$, $\alpha^c$,... by equation (4). The ± sign is needed in (6') since, for
given $n'$'s, the occupied states $\alpha^a$, $\alpha^b$, $\alpha^c$,... are fixed but not their
order, so that the sign of the left-hand side of (6') is not fixed. To
set up a rule which determines the sign in (6'), we must arrange all
the states α for a fermion arbitrarily in some standard order. The
α's occurring in the left-hand side of (6') form a certain selection from
all the α's and the standard order for all the α's will give a standard
order for this selection. We now make the rule that the + sign should
occur in (6') if the α's on the left-hand side can be brought into their
standard order by an even permutation and the − sign if an odd
permutation is required. Owing to the complexity of this rule,
the representation with the basic kets $|n_1'n_2'n_3'\ldots\rangle$ is not a very
useful one.
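The sign rule for (6') can be written out as a short routine. This is a sketch under assumed names (`occupation_ket`, string labels for the fermion states); it uses the fact that the parity of the permutation bringing a selection into standard order equals the parity of the number of out-of-order pairs.

```python
def occupation_ket(ordered_states, standard_order):
    """Return (sign, occupation numbers) for A|states...> = sign * |n1' n2' ...>.

    ordered_states : fermion-state labels in the order they occur in the ket
    standard_order : all fermion states, arranged in the chosen standard order
    A sign of 0 is returned when a state is repeated, since the ket then vanishes.
    """
    position = {state: i for i, state in enumerate(standard_order)}
    ranks = [position[state] for state in ordered_states]
    if len(set(ranks)) != len(ranks):
        return 0, None
    # each out-of-order pair costs one transposition in a bubble sort
    inversions = sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )
    sign = -1 if inversions % 2 else 1
    occupation = tuple(1 if state in ordered_states else 0 for state in standard_order)
    return sign, occupation

standard = ["s1", "s2", "s3", "s4"]          # hypothetical standard order
print(occupation_ket(["s1", "s3"], standard))
print(occupation_ket(["s3", "s1"], standard))
print(occupation_ket(["s4", "s1", "s3"], standard))
```

The same occupied states give the same occupation numbers whatever their order in the ket; only the sign depends on the order, which is the inconvenience the text refers to.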

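If the number of fermions in the assembly is variable, we can set
up the complete set of kets

$$|\rangle, \quad |\alpha^a\rangle, \quad A|\alpha^a\alpha^b\rangle, \quad A|\alpha^a\alpha^b\alpha^c\rangle, \quad \ldots, \tag{9'}$$

corresponding to (9). A general ket is now expressible as a sum of
the various kets (9').

To continue with the development we introduce a set of linear
operators $\eta$, $\bar\eta$, one pair $\eta_a$, $\bar\eta_a$ corresponding to each fermion state $\alpha^a$,
satisfying the commutation relations

$$\begin{aligned}
\eta_a\eta_b + \eta_b\eta_a &= 0,\\
\bar\eta_a\bar\eta_b + \bar\eta_b\bar\eta_a &= 0,\\
\bar\eta_a\eta_b + \eta_b\bar\eta_a &= \delta_{ab}.
\end{aligned} \tag{11'}$$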




These relations are like (11) with a + sign instead of a − on the left-hand
side. They show that, for $a \neq b$, $\eta_a$ and $\bar\eta_a$ anticommute with
$\eta_b$ and $\bar\eta_b$, while, putting $b = a$, they give

$$\eta_a^2 = 0, \qquad \bar\eta_a^2 = 0, \qquad \bar\eta_a\eta_a + \eta_a\bar\eta_a = 1. \tag{11''}$$

To verify that the relations (11') are consistent, we note that linear
operators $\eta$, $\bar\eta$ satisfying the conditions (11') can be constructed in
the following way. For each state $\alpha^a$ we take a set of linear operators
$\sigma_{xa}$, $\sigma_{ya}$, $\sigma_{za}$ like the $\sigma_x$, $\sigma_y$, $\sigma_z$ introduced in § 37 to describe the spin
of an electron and such that $\sigma_{xa}$, $\sigma_{ya}$, $\sigma_{za}$ commute with $\sigma_{xb}$, $\sigma_{yb}$, $\sigma_{zb}$
for $b \neq a$. We also take an independent set of linear operators $\zeta_a$,
one for each state $\alpha^a$, which all anticommute with one another and
have their squares unity, and commute with all the σ variables.
Then, putting

$$\eta_a = \tfrac12\zeta_a(\sigma_{xa} - i\sigma_{ya}), \qquad \bar\eta_a = \tfrac12\zeta_a(\sigma_{xa} + i\sigma_{ya}),$$

we have all the conditions (11') satisfied.
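The construction can be tried out with explicit matrices for two fermion states a and b. The sketch below is only one possible realization and rests on an assumption not made in the text: the two anticommuting operators $\zeta_a$, $\zeta_b$ are represented by the Pauli matrices $\sigma_x$, $\sigma_y$ of an auxiliary two-valued variable, so that the whole space has $2\times2\times2 = 8$ dimensions. All helper names are illustrative.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, s=1):
    """Return A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def is_close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def embed(op_a, op_b, op_aux):
    """Operator on (state a) x (state b) x (auxiliary variable carrying the zetas)."""
    return kron(kron(op_a, op_b), op_aux)

sigma_x = {"a": embed(SX, I2, I2), "b": embed(I2, SX, I2)}
sigma_y = {"a": embed(SY, I2, I2), "b": embed(I2, SY, I2)}
zeta = {"a": embed(I2, I2, SX), "b": embed(I2, I2, SY)}   # assumed realization

eta = {s: scale(0.5, matmul(zeta[s], add(sigma_x[s], sigma_y[s], -1j))) for s in "ab"}
eta_bar = {s: scale(0.5, matmul(zeta[s], add(sigma_x[s], sigma_y[s], 1j))) for s in "ab"}

ONE = embed(I2, I2, I2)
ZERO = scale(0, ONE)

def anticommutator(A, B):
    return add(matmul(A, B), matmul(B, A))

checks = []
for r in "ab":
    for s in "ab":
        delta = ONE if r == s else ZERO
        checks.append(is_close(anticommutator(eta[r], eta[s]), ZERO))
        checks.append(is_close(anticommutator(eta_bar[r], eta_bar[s]), ZERO))
        checks.append(is_close(anticommutator(eta_bar[r], eta[s]), delta))

# eta_a * eta_bar_a should equal its own square, so its eigenvalues are 0 and 1
n_a = matmul(eta["a"], eta_bar["a"])
checks.append(is_close(matmul(n_a, n_a), n_a))

print(all(checks))
```

With more than two fermion states one would need a correspondingly larger set of mutually anticommuting $\zeta$'s; the two-state case is enough to see the mechanism.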

From (11'')

$$(\eta_a\bar\eta_a)^2 = \eta_a\bar\eta_a\eta_a\bar\eta_a = \eta_a(1 - \eta_a\bar\eta_a)\bar\eta_a = \eta_a\bar\eta_a.$$

This is an algebraic equation for $\eta_a\bar\eta_a$, showing that $\eta_a\bar\eta_a$ is an
observable with the eigenvalues 0 and 1. Also $\eta_a\bar\eta_a$ commutes with
$\eta_b\bar\eta_b$ for $b \neq a$. These results allow us to put

$$\eta_a\bar\eta_a = n_a, \tag{12'}$$

the same as (12). From (11'') we get now

$$\bar\eta_a\eta_a = 1 - n_a, \tag{13'}$$

the equation corresponding to (13).

Let us write the normalized ket which is an eigenket of all the n's
belonging to the eigenvalues zero as $\rangle_A$. Then

$$n_a\rangle_A = 0,$$

so from (12')

$${}_A\langle\eta_a\bar\eta_a\rangle_A = 0.$$

Hence

$$\bar\eta_a\rangle_A = 0, \tag{16'}$$

like (16). Again

$${}_A\langle\bar\eta_a\eta_a\rangle_A = {}_A\langle(1 - n_a)\rangle_A = {}_A\langle\rangle_A = 1,$$

showing that $\eta_a\rangle_A$ is normalized, and

$$n_a\eta_a\rangle_A = \eta_a\bar\eta_a\eta_a\rangle_A = \eta_a(1 - n_a)\rangle_A = \eta_a\rangle_A,$$

showing that $\eta_a\rangle_A$ is an eigenket of $n_a$ belonging to the eigenvalue
unity. It is an eigenket of the other n's belonging to the eigenvalues
zero, since the other n's commute with $\eta_a$. By generalizing the
argument we see that $\eta_a\eta_b\eta_c\ldots\eta_g\rangle_A$ is normalized and is a simultaneous
eigenket of all the n's, belonging to the eigenvalues unity
for $n_a, n_b, n_c,\ldots, n_g$ and zero for the other n's. This enables us to put

$$A|\alpha^a\alpha^b\alpha^c\ldots\alpha^g\rangle = \eta_a\eta_b\eta_c\ldots\eta_g\rangle_A, \tag{17'}$$

both sides being antisymmetrical in the labels a, b, c,..., g. We have
here the analogue of (17).

If we pass over to a different set of basic kets $|\alpha^A\rangle$ for a fermion,
we can introduce a new set of linear operators $\eta_A$ corresponding to
them. We then find, by the same argument as in the case of bosons,
that the new η's are connected with the original ones by (21). This
shows that there is a procedure of second quantization for fermions,
similar to that for bosons, with the only difference that the
commutation relations (11') must be employed for fermions to replace the
commutation relations (11) for bosons.

A symmetrical linear operator $U_T$ of the form (22) can be expressed
in terms of the $\eta$, $\bar\eta$ variables by a similar method to that used for
bosons. Equation (24) still holds, and so does (25) with S replaced
by A. Instead of (26) we now have

$$U_T\,\eta_{x_1}\eta_{x_2}\ldots\rangle_A = \sum_a\sum_r (-)^{r-1}\,\eta_a\,\eta_{x_r}^{-1}\,\eta_{x_1}\eta_{x_2}\ldots\rangle_A\,\langle a|U|x_r\rangle, \tag{26'}$$

$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be cancelled out, without its
position among the other $\eta_x$'s being changed before the cancellation.
Instead of (27) we have

$$\bar\eta_b\,\eta_{x_1}\eta_{x_2}\ldots\rangle_A = \sum_r (-)^{r-1}\,\eta_{x_r}^{-1}\,\eta_{x_1}\eta_{x_2}\ldots\rangle_A\,\delta_{bx_r},$$

so (28) holds with $\rangle_A$ for $\rangle_S$ and thus (29) holds unchanged. We have
the same final form (29) for $U_T$ in the fermion case as in the boson
case. Similarly, a symmetrical linear operator $V_T$ of the form (30) can
be expressed as

$$V_T = \tfrac12\sum_{abcd}\eta_a\eta_b\,\langle ab|V|cd\rangle\,\bar\eta_d\bar\eta_c, \tag{36'}$$

the same as one of the ways of writing (35).

The foregoing work shows that there is a deep-seated analogy
between the theory of fermions and that of bosons, only slight
changes having to be made in the general equations of the formalism
when one passes from one to the other.

66. Relativistic treatment of a particle

The theory we have been building up so far is essentially a non- 
relativistic one. We have been working all the time with one par- 
ticular Lorentz frame of reference and have set up the theory as an 
analogue of the classical non-relativistic dynamics. Let us now try 
to make the theory invariant under Lorentz transformations, so that 
it conforms to the special principle of relativity. 

In the first place we note that the general principle of superposition
of states, as given in Chapter I, is a relativistic principle. It
applies to 'states' with the relativistic space-time meaning. Beyond
this, though, the theory does not lend itself very well to relativistic
treatment, owing to the fundamental notion of an 'observable' not
fitting in very well with the requirements of relativity. The measurement
of an observable, in the theory we have been dealing with up
to the present, has always consisted in the measurement of some
dynamical variable at some instant of time in some Lorentz frame
of reference and there does not seem to be any very natural way of
generalizing this notion of an observable to make it cease to refer to
a particular Lorentz frame. In consequence one cannot set up a
scheme of relativistic quantum mechanics with the same degree of
generality as the non-relativistic theory. All one can do is to solve
special problems in a Lorentz-invariant way. This should not be
regarded as a defect of the quantum theory, since it is in perfect
analogy with the classical theory. Relativistic classical mechanics
does not involve any such general scheme as the contact transformation
theory of non-relativistic classical mechanics, but consists in the
solution of comparatively special problems.

One of the special problems that can be handled relativistically is
that of the motion of a particle in an external field of force. Our
non-relativistic quantum mechanics applied to this problem can be fitted
in with the formalism of relativity by a change of notation. We put
$x_1, x_2, x_3$ for x, y, z and $x_0$ for ct, so that the time-dependent wave
function in Schrödinger's representation appears as $\psi(x_0x_1x_2x_3)$,
in which the four x's may be treated on the same footing. We
write the momentum components as $p_1, p_2, p_3$ instead of $p_x, p_y, p_z$.
They satisfy

$$p_r\psi\rangle = -i\hbar\,\frac{\partial\psi}{\partial x_r}\rangle \qquad (r = 1, 2, 3). \tag{1}$$

To preserve the symmetry between the four x's we introduce a
corresponding linear operator $p_0$, equal to the energy divided by c,
whose effect on $\psi\rangle$ is

$$p_0\psi\rangle = i\hbar\,\frac{\partial\psi}{\partial x_0}\rangle. \tag{2}$$

The difference in sign in (1) and (2) is required by relativity.

We treat $x_0$ and $p_0$ as dynamical variables on the same footing as
the other x's and p's. They provide a new degree of freedom. The
standard ket in (1) and (2) must refer to this new degree of freedom
as well as to the previous ones. The lack of symmetry between the
treatment of $x_0$ and that of the other x's in the non-relativistic theory
may be considered as due to our always using a representation with
$x_0$ diagonal and leaving understood the standard ket for the $(x_0p_0)$
degree of freedom. It would seem that only representations with $x_0$
diagonal are useful in the non-relativistic theory. We may therefore
expect that in a relativistic theory, which treats all the four x's on
the same footing, only representations with the four x's diagonal will
be useful. It then becomes convenient to leave understood the standard
ket for all four degrees of freedom and to write any ket as a
wave function in the four x's.

In the theory of the electron that will be developed here we shall
have to introduce some further degrees of freedom describing an
internal motion of the electron. A ket for the whole system will now
be written as a ket in these further degrees of freedom and a wave
function in the four x's, and will appear as $|x_0x_1x_2x_3\rangle$, or $|x\rangle$ for
brevity, according to the notation explained near the end of § 20.


67. The wave equation for the electron 
Let us consider first the case of the motion of an electron in the
absence of an electromagnetic field, so that the problem is simply
that of the free particle, as dealt with in § 30, with the possible
addition of internal degrees of freedom. The relativistic Hamiltonian
provided by classical mechanics for this system is given by equation
(23) of § 30, and leads to the wave equation

$$\{p_0 - (m^2c^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}|x\rangle = 0, \tag{3}$$

where the p's are to be interpreted as operators in accordance with
equations (1) and (2). Equation (3), although it takes into account
the relation between energy and momentum required by relativity,
is yet unsatisfactory from the point of view of relativistic theory,
because it is very unsymmetrical between $p_0$ and the other p's, so
much so that one cannot generalize it in a relativistic way to the
case when there is a field present. We must therefore look for a new
wave equation.

If we multiply the wave equation (3) on the left by the operator
$\{p_0 + (m^2c^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}$, we obtain the equation

$$\{p_0^2 - m^2c^2 - p_1^2 - p_2^2 - p_3^2\}|x\rangle = 0, \tag{4}$$

which is of a relativistically invariant form and may therefore more
conveniently be taken as the basis of a relativistic theory. Equation
(4) is not completely equivalent to equation (3) since, although every
solution of (3) is also a solution of (4), the converse is not true. Only
those solutions of (4) belonging to positive values for $p_0$ are also
solutions of (3).

The wave equation (4) is not of the form required by the general
laws of the quantum theory on account of its being quadratic in $p_0$.
In § 27 we deduced from quite general arguments that the wave
equation must be linear in the operator $\partial/\partial t$ or $p_0$, like equation (7)
of that section. We therefore seek a wave equation that is linear
in $p_0$ and that is roughly equivalent to (4). In order that this wave
equation shall transform in a simple way under a Lorentz transformation,
we try to arrange that it shall be rational and linear in $p_1$, $p_2$,
and $p_3$ as well as in $p_0$, and thus of the form

$$\{p_0 + \alpha_1p_1 + \alpha_2p_2 + \alpha_3p_3 + \beta\}|x\rangle = 0, \tag{5}$$

where the α's and β are independent of the p's. Since we are considering
the case of no field, all points in space-time must be equivalent,
so that the operator in the wave equation must not involve the x's.
Thus the α's and β must also be independent of the x's, so that they
must commute with the p's and the x's. They therefore describe
some new degrees of freedom, belonging to some internal motion in
the electron. We shall see later that they bring in the spin of the
electron. It is these degrees of freedom to which the ket $|x\rangle$ refers.

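Multiplying (5) by the operator $\{p_0 - \alpha_1p_1 - \alpha_2p_2 - \alpha_3p_3 - \beta\}$ on the
left, we obtain

$$\Big\{p_0^2 - \sum_{123}\big[\alpha_1^2p_1^2 + (\alpha_1\alpha_2 + \alpha_2\alpha_1)p_1p_2 + (\alpha_1\beta + \beta\alpha_1)p_1\big] - \beta^2\Big\}|x\rangle = 0,$$

where $\sum_{123}$ refers to cyclic permutations of the suffixes 1, 2, 3. This is
the same as (4) if the α's and β satisfy the relations

$$\alpha_1^2 = 1, \qquad \alpha_1\alpha_2 + \alpha_2\alpha_1 = 0,$$

$$\beta^2 = m^2c^2, \qquad \alpha_1\beta + \beta\alpha_1 = 0,$$

together with the relations obtained from these by permuting the
suffixes 1, 2, 3. If we write

$$\beta = \alpha_m mc,$$

these relations may be summed up in the single one,

$$\alpha_\mu\alpha_\nu + \alpha_\nu\alpha_\mu = 2\delta_{\mu\nu} \qquad (\mu, \nu = 1, 2, 3, \text{ or } m). \tag{6}$$

The four α's all anticommute with one another and the square of
each is unity.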

Thus by giving suitable properties to the α's and β we can make
the wave equation (5) equivalent to (4), in so far as the motion of
the electron as a whole is concerned. We may now assume (5) is the
correct relativistic wave equation for the motion of an electron in
the absence of a field. This gives rise to one difficulty, however,
owing to the fact that (5), like (4), is not exactly equivalent to (3),
but allows solutions corresponding to negative as well as positive
values of $p_0$. The former do not, of course, correspond to any actually
observable motion of an electron. For the present we shall consider
only the positive-energy solutions and shall leave the discussion of
the negative-energy ones to § 73.

We can easily obtain a representation of the four α's. They have
similar algebraic properties to the σ's introduced in § 37, which σ's
can be represented by matrices with two rows and columns. So long
as we keep to matrices with two rows and columns we cannot get a
representation of more than three anticommuting quantities, and we
have to go to four rows and columns to get a representation of the
four anticommuting α's. It is convenient first to express the α's in
terms of the σ's and also of a second similar set of three anticommuting
variables whose squares are unity, $\rho_1, \rho_2, \rho_3$ say, that are
independent of and commute with the σ's. We may take, amongst
other possibilities,

$$\alpha_1 = \rho_1\sigma_1, \qquad \alpha_2 = \rho_1\sigma_2, \qquad \alpha_3 = \rho_1\sigma_3, \qquad \alpha_m = \rho_3, \tag{7}$$

and the α's will then satisfy all the relations (6), as may easily be
verified. If we now take a representation with $\rho_3$ and $\sigma_3$ diagonal,
we shall get the following scheme of matrices:

$$\sigma_1 = \begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}, \qquad
\sigma_2 = \begin{pmatrix} 0&-i&0&0\\ i&0&0&0\\ 0&0&0&-i\\ 0&0&i&0 \end{pmatrix}, \qquad
\sigma_3 = \begin{pmatrix} 1&0&0&0\\ 0&-1&0&0\\ 0&0&1&0\\ 0&0&0&-1 \end{pmatrix},$$

$$\rho_1 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix}, \qquad
\rho_2 = \begin{pmatrix} 0&0&-i&0\\ 0&0&0&-i\\ i&0&0&0\\ 0&i&0&0 \end{pmatrix}, \qquad
\rho_3 = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&-1&0\\ 0&0&0&-1 \end{pmatrix}.$$

Corresponding to the four rows and columns there are four independent
kets, so that the wave function will have four components.
We saw in § 37 that the spin of the electron requires the wave
function to have two components. The fact that our present theory
gives four is due to our wave equation (5) having twice as many
solutions as it ought to have, half of them corresponding to states
of negative energy.
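The statement that the choice (7) satisfies the relations (6) can be checked directly with the matrices of the above scheme. In the sketch below the σ's and ρ's are built as Kronecker products of the two-rowed Pauli matrices with the unit matrix, which reproduces the scheme with $\rho_3$ and $\sigma_3$ diagonal; the helper names are assumptions made for the illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

I2 = [[1, 0], [0, 1]]
PAULI = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

# sigma acts on the inner two-valued index, rho on the outer one
sigma = {k: kron(I2, PAULI[k]) for k in (1, 2, 3)}
rho = {k: kron(PAULI[k], I2) for k in (1, 2, 3)}

# equation (7)
alpha = {k: matmul(rho[1], sigma[k]) for k in (1, 2, 3)}
alpha["m"] = rho[3]

def anticommutation_holds(labels):
    """Check alpha_mu alpha_nu + alpha_nu alpha_mu = 2 delta_mu_nu for the given labels."""
    for mu in labels:
        for nu in labels:
            total = add(matmul(alpha[mu], alpha[nu]), matmul(alpha[nu], alpha[mu]))
            expected = 2 if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    target = expected if i == j else 0
                    if abs(total[i][j] - target) > 1e-12:
                        return False
    return True

print(anticommutation_holds([1, 2, 3, "m"]))
print([sigma[3][i][i] for i in range(4)], [rho[3][i][i] for i in range(4)])
```

Replacing `rho[1]` by `rho[2]` in the definition of the three space-like α's, and keeping `rho[3]` for the fourth, gives another of the "other possibilities" mentioned in the text and passes the same check.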

With the help of (7), the wave equation (5) may be written with
three-dimensional vector notation

$$\{p_0 + \rho_1(\boldsymbol\sigma, \mathbf{p}) + \rho_3mc\}|x\rangle = 0. \tag{8}$$

To generalize this equation to the case when there is an electromagnetic
field present, we follow the classical rule of replacing $p_0$ and
$\mathbf{p}$ by $p_0 + \frac{e}{c}A_0$ and $\mathbf{p} + \frac{e}{c}\mathbf{A}$, $A_0$ and $\mathbf{A}$ being the scalar and vector
potentials of the field at the place where the electron is. This gives
us the equation

$$\Big\{p_0 + \frac{e}{c}A_0 + \rho_1\Big(\boldsymbol\sigma, \mathbf{p} + \frac{e}{c}\mathbf{A}\Big) + \rho_3mc\Big\}|x\rangle = 0, \tag{9}$$

which is the fundamental wave equation of the relativistic theory of
the electron. The conjugate imaginary equation is

$$\langle x|\Big\{p_0 + \frac{e}{c}A_0 + \rho_1\Big(\boldsymbol\sigma, \mathbf{p} + \frac{e}{c}\mathbf{A}\Big) + \rho_3mc\Big\} = 0, \tag{10}$$

in which the operators p operate to the left. An operator of differentiation
operating to the left must be interpreted according to (24) of
§ 22.

68. Invariance under a Lorentz transformation

Before proceeding to discuss the physical consequences of the wave
equation (9) or (10), we shall first verify that our theory really is
invariant under a Lorentz transformation, or, stated more accurately,
that the physical results the theory leads to are independent of the
Lorentz frame of reference used. This is not by any means obvious
from the form of the wave equation (9). We have to verify that, if
we write down the wave equation in a different Lorentz frame, the
solutions of the new wave equation may be put into one-one correspondence
with those of the original one in such a way that corresponding
solutions may be assumed to represent the same state. For
either Lorentz frame, the square of the length of the ket $|x\rangle$ should
give the probability per unit volume of the electron being at the place
x in that Lorentz frame. We may call this the probability density. Its
values, calculated in different Lorentz frames for wave functions
representing the same state, should be connected like the time components
in these frames of some 4-vector. Further, the 4-dimensional
divergence of this 4-vector should vanish, signifying conservation of
the electron, or that the electron cannot appear or disappear in any
volume without passing through the boundary.

For discussing Lorentz transformations it is convenient to make
the convention that terms containing a repeated suffix are to be
summed over the values 0, 1, 2, 3 for that suffix. This enables us to
write equation (9) in the form

$$\Big\{\alpha_\mu\Big(p_\mu + \frac{e}{c}A_\mu\Big) + \alpha_m mc\Big\}|x\rangle = 0, \tag{11}$$

$\alpha_0$ being equal to unity, and similarly we can write equation (10) in
the form

$$\langle x|\Big\{\alpha_\mu\Big(p_\mu + \frac{e}{c}A_\mu\Big) + \alpha_m mc\Big\} = 0. \tag{12}$$

We now apply a Lorentz transformation and denote quantities
referring to the new frame by a star. The components of the 4-vectors
p and A will transform according to a linear law of the type

$$p_\mu = a_{\mu\nu}p_\nu^*. \tag{13}$$

Substituting these expressions for $p_\mu$ and $A_\mu$ in equations (11) and
(12), we obtain

$$\begin{aligned}
&\Big\{\alpha_\mu a_{\mu\nu}\Big(p_\nu^* + \frac{e}{c}A_\nu^*\Big) + \alpha_m mc\Big\}|x\rangle = 0\\
\text{and}\quad &\langle x|\Big\{\alpha_\mu a_{\mu\nu}\Big(p_\nu^* + \frac{e}{c}A_\nu^*\Big) + \alpha_m mc\Big\} = 0.
\end{aligned} \tag{14}$$

We now try to bring these equations back to the form of the original
(11) and (12) by making a transformation

$$|x^*\rangle = \gamma|x\rangle, \tag{15}$$

where γ is a linear operator in the internal degrees of freedom and is
independent of the x's and p's. The conjugate imaginary equation
to (15) is

$$\langle x^*| = \langle x|\bar\gamma. \tag{16}$$


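Equations (14) will go over into the equations

$$\begin{aligned}
&\bar\gamma\Big\{\alpha_\nu\Big(p_\nu^* + \frac{e}{c}A_\nu^*\Big) + \alpha_m mc\Big\}|x^*\rangle = 0\\
\text{and}\quad &\langle x^*|\Big\{\alpha_\nu\Big(p_\nu^* + \frac{e}{c}A_\nu^*\Big) + \alpha_m mc\Big\}\gamma = 0
\end{aligned} \tag{17}$$

provided we can choose γ such that

$$\bar\gamma\alpha_\nu\gamma = \alpha_\mu a_{\mu\nu}, \qquad \bar\gamma\alpha_m\gamma = \alpha_m. \tag{18}$$

These equations (17) are of the same form as (11) and (12), as required,
since one can divide out by the extra factors $\bar\gamma$ and $\gamma$. The
transformation given by (15), (16), and (18) is something like a
unitary transformation, but is more general since γ does not satisfy
the unitary condition.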

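In order to verify that we can choose γ to satisfy the equations
(18), let us first take the special case when the change of our frame
of reference consists simply of a rotation through a hyperbolic angle
θ in the $x_0x_1$-plane, so that the transformation equations for the
components of a 4-vector are of the type

$$\begin{aligned}
p_0 &= p_0^*\cosh\theta + p_1^*\sinh\theta,\\
p_1 &= p_0^*\sinh\theta + p_1^*\cosh\theta,\\
p_2 &= p_2^*, \qquad p_3 = p_3^*.
\end{aligned} \tag{19}$$

The values of the $a_{\mu\nu}$ may be written down at once from a comparison
of these equations with (13). With these values for the $a_{\mu\nu}$ it is easy
to see that equations (18) hold when we take

$$\gamma = e^{\frac12\theta\alpha_1} = \bar\gamma. \tag{20}$$

We have, in fact,

$$\bar\gamma\alpha_0\gamma = \gamma\gamma = e^{\theta\alpha_1} = 1 + \theta\alpha_1 + \theta^2\alpha_1^2/2! + \theta^3\alpha_1^3/3! + \ldots.$$

On account of $\alpha_1^2 = 1$, this reduces to

$$\begin{aligned}
\bar\gamma\alpha_0\gamma &= \{1 + \theta^2/2! + \ldots\} + \alpha_1\{\theta + \theta^3/3! + \ldots\}\\
&= \cosh\theta + \alpha_1\sinh\theta\\
&= \alpha_0\cosh\theta + \alpha_1\sinh\theta.
\end{aligned}$$

Again,

$$\bar\gamma\alpha_1\gamma = \alpha_1\gamma\gamma = \alpha_0\sinh\theta + \alpha_1\cosh\theta.$$

Further,

$$\bar\gamma\alpha_2\gamma = e^{\frac12\theta\alpha_1}\alpha_2e^{\frac12\theta\alpha_1} = e^{\frac12\theta\alpha_1}e^{-\frac12\theta\alpha_1}\alpha_2 = \alpha_2,$$

since $\alpha_2$ anticommutes with $\alpha_1$, which results in $\alpha_2f(\alpha_1) = f(-\alpha_1)\alpha_2$
for any function $f(\alpha_1)$ of $\alpha_1$. Similarly,

$$\bar\gamma\alpha_3\gamma = \alpha_3, \qquad \bar\gamma\alpha_m\gamma = \alpha_m.$$

Thus the five equations (18) hold with γ given by (20) when the
$a_{\mu\nu}$ are given by (19).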
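The verification of (18) for the hyperbolic rotation can be repeated numerically. The sketch below assumes the matrix representation of § 67 (with $\alpha_r = \rho_1\sigma_r$, $\alpha_m = \rho_3$), an arbitrary test value of θ, and the closed form $e^{\frac12\theta\alpha_1} = \cosh\frac12\theta + \alpha_1\sinh\frac12\theta$, which follows from $\alpha_1^2 = 1$; the helper names are illustrative only.

```python
from math import cosh, sinh

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def lincomb(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    size = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(size)] for i in range(size)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

I4 = kron(I2, I2)
rho1, rho3 = kron(X, I2), kron(Z, I2)
alpha = [I4] + [matmul(rho1, kron(I2, P)) for P in (X, Y, Z)]   # alpha_0 ... alpha_3
alpha_m = rho3

theta = 0.7                                    # arbitrary hyperbolic angle
gamma = lincomb([(cosh(theta / 2), I4), (sinh(theta / 2), alpha[1])])   # equation (20)

# coefficients a[mu][nu] read off from (19) by comparison with (13)
a = [[cosh(theta), sinh(theta), 0, 0],
     [sinh(theta), cosh(theta), 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

# gamma-bar equals gamma here, so gamma-bar alpha_nu gamma is gamma alpha_nu gamma
ok = all(
    close(matmul(gamma, matmul(alpha[nu], gamma)),
          lincomb([(a[mu][nu], alpha[mu]) for mu in range(4)]))
    for nu in range(4)
) and close(matmul(gamma, matmul(alpha_m, gamma)), alpha_m)

print(ok)
```

The product $\gamma\gamma$ is not the unit matrix for $\theta \neq 0$, which is the sense in which γ fails to satisfy the unitary condition.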

As a second typical change of the frame of reference, we may consider
a rotation through an angle θ in ordinary space about the $x_1$-axis.
The transformation equations are now

$$\begin{aligned}
p_0 &= p_0^*, \qquad p_1 = p_1^*,\\
p_2 &= p_2^*\cos\theta + p_3^*\sin\theta,\\
p_3 &= -p_2^*\sin\theta + p_3^*\cos\theta.
\end{aligned}$$

With the new values for the $a_{\mu\nu}$ we can easily verify that equations
(18) hold with

$$\gamma = e^{\frac12\theta\alpha_3\alpha_2} = e^{-\frac12 i\theta\sigma_1},$$

the analysis being very similar to the preceding case.

If two changes of the frame of reference are made consecutively,
we simply have to multiply the corresponding γ's to get the γ for
the resultant change. Now any change of the frame of reference may
be built up from two rotations of the types we have considered, and
hence there will always be a γ satisfying (18).

In this way we see that the solutions of the wave equations in the
new frame of reference, equations (17), can be put into a natural one-one
correspondence with those of the original wave equations (11)
and (12), corresponding solutions being connected by (15) and (16),
and we may assume that corresponding solutions represent the same
state. It remains for us to verify that the probability density transforms
like the time component of a 4-vector and that the divergence
of this 4-vector vanishes.

The probability density is $\langle x|x\rangle = \langle x|\alpha_0|x\rangle$, since $\alpha_0 = 1$. Let us
see how the four quantities $\langle x|\alpha_\mu|x\rangle$, with μ = 0, 1, 2, 3, transform
under a Lorentz transformation. We have, from (15), (16), and (18),

$$\langle x^*|\alpha_\nu|x^*\rangle = \langle x|\bar\gamma\alpha_\nu\gamma|x\rangle = \langle x|\alpha_\mu|x\rangle a_{\mu\nu}.$$

Comparing this result with (13), we see that the four quantities
$\langle x|\alpha_\mu|x\rangle$ transform like the covariant components of a 4-vector (as
defined in § 74). The contravariant components will be

$$\langle x|x\rangle, \qquad -\langle x|\alpha_r|x\rangle \quad (r = 1, 2, 3). \tag{21}$$

This verifies that the probability density $\langle x|x\rangle$ is the time component
of a 4-vector and that the corresponding space components are
$-\langle x|\alpha_r|x\rangle$ (with r = 1, 2, 3). These space components multiplied by
the factor c give the probability current, or the probability of the
electron crossing unit area per unit time.

The divergence of the 4-vector is

$$\sum_\mu \pm\frac{\partial}{\partial x_\mu}\langle x|\alpha_\mu|x\rangle, \tag{22}$$

where the ± sign means that the + sign is to be taken for μ = 0
and the − sign for μ = 1, 2, 3 before one does the summation. To
prove this divergence vanishes, multiply equation (11) by $\langle x|$ on the
left and (12) by $|x\rangle$ on the right and subtract. The result is

the dots denoting that $p_\mu$ operates to the right on $|x\rangle$ in the first
term and to the left on $\langle x|$ in the second. With the help of (1) and
(2) and the interpretation (24) of § 22 for operators of differentiation
operating to the left, this gives






which just expresses the vanishing of (22). In this way we complete
the proof that our theory gives consistent results in whichever frame
of reference it is applied.


69. The motion of a free electron 
It is of interest to consider the motion of a free electron in the
above theory according to the Heisenberg picture and to study the
Heisenberg equations of motion. These equations of motion can be
integrated exactly, as was first done by Schrödinger.† For brevity
we shall omit the suffix t which the notation of § 28 requires to be
inserted in dynamical variables that vary with time in the Heisenberg
picture.

As Hamiltonian we must take the expression which we get as equal
to $cp_0$ when we put the operator on $|x\rangle$ in (8) equal to zero, i.e.

$$H = -c\rho_1(\boldsymbol\sigma, \mathbf{p}) - \rho_3mc^2 = -c(\boldsymbol\alpha, \mathbf{p}) - \rho_3mc^2. \tag{23}$$

† Schrödinger, Sitzungsb. d. Berlin. Akad., 1930, p. 418.

We see at once that the momentum commutes with H and is thus a
constant of the motion. Further, the $x_1$-component of the velocity is

$$\dot{x}_1 = [x_1, H] = -c\alpha_1. \tag{24}$$

This result is rather surprising, as it means an altogether different
relation between velocity and momentum from what one has in
classical mechanics. It is connected, however, with the expressions
(21) for the probability density and current. The $\dot{x}_1$ given by (24)
has as eigenvalues ±c, corresponding to the eigenvalues ±1 of $\alpha_1$.
As $\dot{x}_2$ and $\dot{x}_3$ are similar, we can conclude that a measurement of a
component of the velocity of a free electron is certain to lead to the result ±c.
This conclusion is easily seen to hold also when there is a field present.

Since electrons are observed in practice to have velocities con- 
siderably less than that of light, it would seem that we have here a 
contradiction with experiment. The contradiction is not real, though, 
since the theoretical velocity in the above conclusion is the velocity 
at one instant of time while observed velocities are always average 
velocities through appreciable time intervals. We shall find upon 
further examination of the equations of motion that the velocity is 
not at all constant, but oscillates rapidly about a mean value which 
agrees with the observed value. 

It may easily be verified that a measurement of a component of the
velocity must lead to the result ±c in a relativistic theory, simply
from an elementary application of the principle of uncertainty of
§ 24. To measure the velocity we must measure the position at two
slightly different times and then divide the change of position by the
time interval. (It will not do to measure the momentum and apply
a formula, as the ordinary connexion between velocity and momentum
is not valid.) In order that our measured velocity may approximate
to the instantaneous velocity, the time interval between the
two measurements of position must be very short and hence these
measurements must be very accurate. The great accuracy with
which the position of the electron is known during the time-interval
must give rise, according to the principle of uncertainty, to an almost
complete indeterminacy in its momentum. This means that almost
all values of the momentum are equally probable, so that the momentum
is almost certain to be infinite. An infinite value for a component
of momentum corresponds to the value ±c for the corresponding
component of velocity.

Let us now examine how the velocity of the electron varies with
time. We have

$$i\hbar\dot\alpha_1 = \alpha_1H - H\alpha_1.$$

Now since $\alpha_1$ anticommutes with all the terms in H except $-c\alpha_1p_1$,

$$\alpha_1H + H\alpha_1 = -\alpha_1c\alpha_1p_1 - c\alpha_1p_1\alpha_1 = -2cp_1,$$

and hence

$$\begin{aligned}
i\hbar\dot\alpha_1 &= 2\alpha_1H + 2cp_1\\
&= -2H\alpha_1 - 2cp_1.
\end{aligned} \tag{25}$$

Since H and $p_1$ are constants, it follows from the first of equations
(25) that

$$i\hbar\ddot\alpha_1 = 2\dot\alpha_1H. \tag{26}$$

This differential equation in $\dot\alpha_1$ can be integrated immediately, the
result being

$$\dot\alpha_1 = \dot\alpha_1^0e^{-2iHt/\hbar}, \tag{27}$$

where $\dot\alpha_1^0$ is a constant, equal to the value of $\dot\alpha_1$ when t = 0. The
factor $e^{-2iHt/\hbar}$ must be put to the right of the factor $\dot\alpha_1^0$ in (27) on
account of the H occurring to the right of the $\dot\alpha_1$ in (26). The second
of equations (25) leads in the same way to the result

$$\dot\alpha_1 = e^{2iHt/\hbar}\dot\alpha_1^0.$$

We can now easily complete the integration of the equation of motion
for $x_1$. From (27) and the first of equations (25)

$$\alpha_1 = \tfrac12 i\hbar\dot\alpha_1^0e^{-2iHt/\hbar}H^{-1} - cp_1H^{-1}, \tag{28}$$

and hence the time-integral of equation (24) is

$$x_1 = \tfrac14 c\hbar^2\dot\alpha_1^0e^{-2iHt/\hbar}H^{-2} + c^2p_1H^{-1}t + a_1, \tag{29}$$

$a_1$ being a constant.

From (28) we see that the $x_1$-component of velocity, $-c\alpha_1$, consists
of two parts, a constant part $c^2p_1H^{-1}$, connected with the momentum
by the classical relativistic formula, and an oscillatory part

$$-\tfrac12 ic\hbar\dot\alpha_1^0e^{-2iHt/\hbar}H^{-1},$$

whose frequency is high, being $2H/h$, which is at least $2mc^2/h$. Only
the constant part would be observed in a practical measurement of
velocity, such a measurement giving the average velocity through a
time-interval much larger than $h/2mc^2$. The oscillatory part secures
that the instantaneous value of $\dot{x}_1$ shall have the eigenvalues ±c. The
oscillatory part of $x_1$ is small, being, according to (29),

$$\tfrac14 c\hbar^2\dot\alpha_1^0e^{-2iHt/\hbar}H^{-2} = -\tfrac12 ic\hbar(\alpha_1 + cp_1H^{-1})H^{-1},$$

which is of the order of magnitude $\hbar/mc$, since $(\alpha_1 + cp_1H^{-1})$ is of the
order of magnitude unity.
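To give an idea of the magnitudes involved, the lower bound $2mc^2/h$ on the frequency, the corresponding time $h/2mc^2$, and the length $\hbar/mc$ can be evaluated for the electron. The numerical values of the constants below are modern SI values supplied for this illustration; they are an assumption of the sketch and do not come from the text.

```python
from math import pi

# Assumed modern SI values (not taken from the text)
h = 6.62607015e-34        # Planck's constant, J s
c = 299792458.0           # velocity of light, m/s
m = 9.1093837e-31         # mass of the electron, kg

hbar = h / (2 * pi)

min_frequency = 2 * m * c**2 / h      # lower bound 2mc^2/h on the oscillation frequency, Hz
time_scale = h / (2 * m * c**2)       # a measurement must average over much longer than this, s
amplitude_scale = hbar / (m * c)      # order of magnitude of the oscillatory part of x_1, m

print(f"2mc^2/h  ~ {min_frequency:.3e} Hz")
print(f"h/2mc^2  ~ {time_scale:.3e} s")
print(f"hbar/mc  ~ {amplitude_scale:.3e} m")
```

On these figures the oscillation has a frequency of order $10^{20}$ per second and an amplitude of order $10^{-13}$ metres, which is why only the constant part of the velocity shows up in a practical measurement.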
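70. Existence of the spin

In § 67 we saw that the correct wave equation for the electron in
the absence of an electromagnetic field, namely equation (5) or (8), is
equivalent to the wave equation (4) which is suggested from analogy
with the classical theory. This equivalence no longer holds when
there is a field. The wave equation to be expected from analogy with
the classical theory in this case is

$$\Big\{\Big(p_0 + \frac{e}{c}A_0\Big)^2 - \Big(\mathbf{p} + \frac{e}{c}\mathbf{A}\Big)^2 - m^2c^2\Big\}|x\rangle = 0, \tag{30}$$

in which the operator is just the classical relativistic Hamiltonian.
If we multiply (9) by some factor on the left to make it resemble
(30) as closely as possible, namely the factor

$$p_0 + \frac{e}{c}A_0 - \rho_1\Big(\boldsymbol\sigma, \mathbf{p} + \frac{e}{c}\mathbf{A}\Big) - \rho_3mc,$$

we get

$$\Big\{\Big(p_0 + \frac{e}{c}A_0\Big)^2 - \Big(\boldsymbol\sigma, \mathbf{p} + \frac{e}{c}\mathbf{A}\Big)^2 - m^2c^2 + \rho_1\Big[\Big(p_0 + \frac{e}{c}A_0\Big)\Big(\boldsymbol\sigma, \mathbf{p} + \frac{e}{c}\mathbf{A}\Big) - \Big(\boldsymbol\sigma, \mathbf{p} + \frac{e}{c}\mathbf{A}\Big)\Big(p_0 + \frac{e}{c}A_0\Big)\Big]\Big\}|x\rangle = 0. \tag{31}$$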

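We now use the general formula that, if $\mathbf B$ and $\mathbf C$ are any two
three-dimensional vectors that commute with $\boldsymbol\sigma$,
$$(\boldsymbol\sigma, \mathbf B)(\boldsymbol\sigma, \mathbf C) = \sum\{\sigma_1^2B_1C_1+\sigma_1\sigma_2B_1C_2+\sigma_2\sigma_1B_2C_1\},$$
the summation referring to cyclic permutations of the suffixes 1, 2, 3,
or
$$(\boldsymbol\sigma, \mathbf B)(\boldsymbol\sigma, \mathbf C) = (\mathbf B, \mathbf C)+i\sum\sigma_3(B_1C_2-B_2C_1)
= (\mathbf B, \mathbf C)+i(\boldsymbol\sigma, \mathbf B\times\mathbf C). \tag{32}$$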
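
Formula (32) is easy to spot-check for ordinary numerical vectors, which trivially commute with $\boldsymbol\sigma$. The sketch below assumes the standard Pauli matrices as the representation of $\sigma_1, \sigma_2, \sigma_3$; the helper names are illustrative only. It does not cover the operator case, where $\mathbf B\times\mathbf C$ need not vanish for $\mathbf B = \mathbf C$.

```python
import numpy as np

# Standard Pauli matrices, stacked so that SIGMA[k] is sigma_{k+1}.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(v):
    """Return (sigma, v) = sum_k sigma_k v_k as a 2x2 matrix."""
    return np.einsum("k,kij->ij", np.asarray(v, dtype=complex), SIGMA)


def lhs(B, C):
    """(sigma, B)(sigma, C)."""
    return sigma_dot(B) @ sigma_dot(C)


def rhs(B, C):
    """(B, C) + i (sigma, B x C)."""
    return np.dot(B, C) * np.eye(2) + 1j * sigma_dot(np.cross(B, C))
```

For any two real 3-vectors `B` and `C`, `np.allclose(lhs(B, C), rhs(B, C))` holds.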

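Taking $\mathbf B = \mathbf C = \mathbf p+e/c\,.\,\mathbf A$, we find, since
$$\Bigl(\mathbf p+\frac ec\mathbf A\Bigr)\times\Bigl(\mathbf p+\frac ec\mathbf A\Bigr) = \frac ec\{\mathbf p\times\mathbf A+\mathbf A\times\mathbf p\}
= -i\hbar\frac ec\operatorname{curl}\mathbf A = -i\hbar\frac ec\boldsymbol{\mathcal H},$$
where $\boldsymbol{\mathcal H}$ is the magnetic field, that
$$\Bigl(\boldsymbol\sigma,\ \mathbf p+\frac ec\mathbf A\Bigr)^2 = \Bigl(\mathbf p+\frac ec\mathbf A\Bigr)^2+\frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal H}). \tag{33}$$
Also we have
$$\Bigl(p_0+\frac ecA_0\Bigr)\Bigl(\boldsymbol\sigma,\ \mathbf p+\frac ec\mathbf A\Bigr)-\Bigl(\boldsymbol\sigma,\ \mathbf p+\frac ec\mathbf A\Bigr)\Bigl(p_0+\frac ecA_0\Bigr)
= \frac ec(\boldsymbol\sigma,\ p_0\mathbf A-\mathbf Ap_0+A_0\mathbf p-\mathbf pA_0)$$
$$= \frac{i\hbar e}{c}\Bigl(\boldsymbol\sigma,\ \frac1c\frac{\partial\mathbf A}{\partial t}+\operatorname{grad}A_0\Bigr) = -\frac{i\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal E}),$$
where $\boldsymbol{\mathcal E}$ is the electric field. Thus (31) becomes
$$\Bigl\{\Bigl(p_0+\frac ecA_0\Bigr)^2-\Bigl(\mathbf p+\frac ec\mathbf A\Bigr)^2-m^2c^2-\frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal H})-i\rho_1\frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal E})\Bigr\}|x\rangle = 0. \tag{34}$$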

This equation differs from (30) through having two extra terms in 
the operator. These extra terms involve some new physical effects, 
but since they are not real they do not lend themselves very directly 
to physical interpretation. 

To get an understanding of the physical features involved in the
difference between (34) and (30) it is better to work with the Heisenberg
picture, this picture being always the more suitable one for
comparisons between classical and quantum mechanics. The Heisenberg
equations of motion are determined by the Hamiltonian
$$H = -eA_0-c\rho_1\Bigl(\boldsymbol\sigma,\ \mathbf p+\frac ec\mathbf A\Bigr)-\rho_3mc^2, \tag{35}$$
the generalization of (23) to the case when there is a field. Equation
(35) gives
$$\Bigl(\frac Hc+\frac ecA_0\Bigr)^2 = \Bigl(\boldsymbol\sigma,\ \mathbf p+\frac ec\mathbf A\Bigr)^2+m^2c^2
= \Bigl(\mathbf p+\frac ec\mathbf A\Bigr)^2+m^2c^2+\frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal H}), \tag{36}$$
with the help of (33). We have here the real part of the extra terms
in (34) appearing without the pure imaginary part. For an electron
moving slowly (i.e. with small momentum), we may expect the
Heisenberg equations of motion to be determined by a Hamiltonian
of the form $mc^2+H_1$, where $H_1$ is small compared with $mc^2$. Putting
$mc^2+H_1$ for $H$ in (36) and neglecting $H_1^2$ and other terms involving
$c^{-2}$, we get, on dividing by $2m$,
$$H_1+eA_0 = \frac1{2m}\Bigl(\mathbf p+\frac ec\mathbf A\Bigr)^2+\frac{\hbar e}{2mc}(\boldsymbol\sigma, \boldsymbol{\mathcal H}). \tag{37}$$

The Hamiltonian $H_1$ given by (37) is the same as the classical
Hamiltonian for a slow electron, except for the last term
$$\frac{\hbar e}{2mc}(\boldsymbol\sigma, \boldsymbol{\mathcal H}).$$
This term may be considered as an additional potential energy
which a slow electron has in the quantum theory and may be
interpreted as arising from the electron having a magnetic moment
$-\hbar e/2mc\,.\,\boldsymbol\sigma$. This magnetic moment is the one assumed in §§ 41 and
47 for dealing with the Zeeman effect and is in agreement with
experiment.
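
To give a feeling for the size of this moment, the sketch below evaluates $\hbar e/2mc$ in Gaussian units and the corresponding extra energy $(\hbar e/2mc)(\boldsymbol\sigma, \boldsymbol{\mathcal H})$ for spin along or against the field. The numerical constants are rounded modern values supplied here as assumptions (they do not come from the text), and the function names are illustrative.

```python
# Rounded constants in CGS-Gaussian units (assumed values, not taken from the text).
HBAR = 1.054571817e-27       # erg s
E_CHARGE = 4.80320471e-10    # esu
M_ELECTRON = 9.1093837e-28   # g
C_LIGHT = 2.99792458e10      # cm / s


def spin_moment_magnitude():
    """Magnitude hbar*e/(2*m*c) of the spin magnetic moment, in erg per gauss."""
    return HBAR * E_CHARGE / (2 * M_ELECTRON * C_LIGHT)


def spin_energy(field_gauss, sigma_component):
    """Extra energy (hbar e / 2mc)(sigma, H) for sigma along the field equal to +1 or -1."""
    if sigma_component not in (1, -1):
        raise ValueError("sigma_component must be +1 or -1")
    return spin_moment_magnitude() * field_gauss * sigma_component
```

The two values `spin_energy(B, +1)` and `spin_energy(B, -1)` differ by $2(\hbar e/2mc)B$, which is the spin contribution to the Zeeman splitting.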

The spin angular momentum does not give rise to any potential 
energy and therefore does not appear in the result of the preceding 
calculation. The simplest way of showing the existence of the spin 
angular momentum is to take the case of the motion of a free electron 
or an electron in a central field of force and determine the angular 
momentum integrals. This means working with the Hamiltonian (23), 
or with the Hamiltonian (35) with $\mathbf A = 0$ and $A_0$ a function of the
radius $r$, i.e.
$$H = -eA_0(r)-c\rho_1(\boldsymbol\sigma, \mathbf p)-\rho_3mc^2, \tag{38}$$
and obtaining the Heisenberg equations of motion for the angular
momentum. With either Hamiltonian we find for the rate of change
of the $x_1$-component of orbital angular momentum, $m_1 = x_2p_3-x_3p_2$,
with the help of commutation relations proved in § 36,
$$\begin{aligned}
i\hbar\dot m_1 &= m_1H-Hm_1\\
&= -c\rho_1\{m_1(\boldsymbol\sigma, \mathbf p)-(\boldsymbol\sigma, \mathbf p)m_1\}\\
&= -c\rho_1(\boldsymbol\sigma,\ m_1\mathbf p-\mathbf pm_1)\\
&= -i\hbar c\rho_1\{\sigma_2p_3-\sigma_3p_2\}.
\end{aligned}$$
Thus $\dot m_1 \neq 0$ and the orbital angular momentum is not a constant
of the motion. This result is to be expected from the integrated
equation of motion (29), the oscillatory part of the motion here
displayed giving rise to an oscillatory term in the angular momentum.
We have further
$$\begin{aligned}
i\hbar\dot\sigma_1 &= \sigma_1H-H\sigma_1\\
&= -c\rho_1\{\sigma_1(\boldsymbol\sigma, \mathbf p)-(\boldsymbol\sigma, \mathbf p)\sigma_1\}\\
&= -c\rho_1(\sigma_1\boldsymbol\sigma-\boldsymbol\sigma\sigma_1,\ \mathbf p)\\
&= -2ic\rho_1\{\sigma_3p_2-\sigma_2p_3\}
\end{aligned}$$
with the help of equations (51) of § 37. Hence
$$\dot m_1+\tfrac12\hbar\dot\sigma_1 = 0,$$
so that the vector $\mathbf m+\tfrac12\hbar\boldsymbol\sigma$ is a constant of the motion. This result
one can interpret by saying the electron has a spin angular momentum
$\tfrac12\hbar\boldsymbol\sigma$, which must be added to the orbital angular momentum $\mathbf m$ before
one gets a constant of the motion. The spin angular momentum
could alternatively be obtained from the rotation operators for states
of spin in accordance with the general method of § 35.

The same vector $\boldsymbol\sigma$ fixes the directions of both the spin magnetic
moment and the spin angular momentum. If an electron in a certain
state of spin has a spin angular momentum of $\tfrac12\hbar$ in a particular
direction, it will have a magnetic moment $-e\hbar/2mc$ in the same
direction.
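
The conservation of $\mathbf m+\tfrac12\hbar\boldsymbol\sigma$ can be verified symbolically by letting the operators act on an arbitrary four-component wave function. The sketch below does this for the first component with the free-electron Hamiltonian $H = -c\rho_1(\boldsymbol\sigma, \mathbf p)-\rho_3mc^2$; the matrix representation, the use of SymPy, and every function name are assumptions made for illustration, not part of the text.

```python
import sympy as sp

x1, x2, x3 = sp.symbols("x1 x2 x3", real=True)
hbar, c, m = sp.symbols("hbar c m", positive=True)
X = (x1, x2, x3)

sx = sp.Matrix([[0, 1], [1, 0]])
sy = sp.Matrix([[0, -sp.I], [sp.I, 0]])
sz = sp.Matrix([[1, 0], [0, -1]])
I2 = sp.eye(2)


def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 matrix."""
    return sp.Matrix(4, 4, lambda i, j: a[int(i) // 2, int(j) // 2] * b[int(i) % 2, int(j) % 2])


rho1, rho3 = kron(sx, I2), kron(sz, I2)          # assumed representation
sigma = [kron(I2, s) for s in (sx, sy, sz)]

# Arbitrary four-component wave function of the coordinates.
psi = sp.Matrix([sp.Function(f"f{i}")(*X) for i in range(4)])


def p(k, spinor):
    """Momentum component p_{k+1} = -i hbar d/dx_{k+1}."""
    return -sp.I * hbar * spinor.diff(X[k])


def H(spinor):
    """Free Hamiltonian -c rho_1 (sigma, p) - rho_3 m c^2 acting on a spinor."""
    out = -m * c**2 * rho3 * spinor
    for k in range(3):
        out = out - c * rho1 * sigma[k] * p(k, spinor)
    return out


def m1(spinor):
    """Orbital component m_1 = x_2 p_3 - x_3 p_2."""
    return x2 * p(2, spinor) - x3 * p(1, spinor)


def M1(spinor):
    """Total component m_1 + (hbar/2) sigma_1."""
    return m1(spinor) + hbar / 2 * sigma[0] * spinor


def commutator(A, B, spinor):
    """(AB - BA) applied to the spinor, simplified entry by entry."""
    return (A(B(spinor)) - B(A(spinor))).applyfunc(sp.simplify)
```

Here `commutator(H, M1, psi)` reduces to the zero column, while `commutator(H, m1, psi)` does not, in line with the statement that only the sum of orbital and spin parts is conserved.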

71. Transition to polar variables 

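For the further study of the motion of an electron in a central field
of force with the Hamiltonian (38), it is convenient to make a
transformation to polar coordinates, as was done in § 38 in the
non-relativistic case. We can introduce $r$ and $p_r$ as before, but
instead of $k$, the magnitude of the orbital angular momentum $\mathbf m$,
which is no longer a constant of the motion, we must now use the
magnitude of the total angular momentum $\mathbf M = \mathbf m+\tfrac12\hbar\boldsymbol\sigma$. Let us put
$$j^2\hbar^2 = M_1^2+M_2^2+M_3^2+\tfrac14\hbar^2. \tag{39}$$
The eigenvalues of $m_3$ are integral multiples of $\hbar$, those of $\tfrac12\hbar\sigma_3$ are
$\pm\tfrac12\hbar$, and hence those of $M_3$ must be half-odd integral multiples of
$\hbar$. It follows from the theory of § 36 that the eigenvalues of $|j|$ must
be integers greater than zero.

If in formula (32) we take $\mathbf B = \mathbf C = \mathbf m$, we get
$$\begin{aligned}
(\boldsymbol\sigma, \mathbf m)^2 &= \mathbf m^2+i(\boldsymbol\sigma, \mathbf m\times\mathbf m)\\
&= \mathbf m^2-\hbar(\boldsymbol\sigma, \mathbf m)\\
&= (\mathbf m+\tfrac12\hbar\boldsymbol\sigma)^2-2\hbar(\boldsymbol\sigma, \mathbf m)-\tfrac34\hbar^2.
\end{aligned}$$
Hence
$$\{(\boldsymbol\sigma, \mathbf m)+\hbar\}^2 = \mathbf M^2+\tfrac14\hbar^2.$$
Thus $(\boldsymbol\sigma, \mathbf m)+\hbar$ is a quantity whose square is $\mathbf M^2+\tfrac14\hbar^2$ and we could,
consistently with equation (39), define $j\hbar$ as $(\boldsymbol\sigma, \mathbf m)+\hbar$. This would
not be the most convenient definition for $j$, however, since we would
like to have $j$ a constant of the motion and $(\boldsymbol\sigma, \mathbf m)+\hbar$ is not constant.
We have, in fact, from applications of (32),
$$(\boldsymbol\sigma, \mathbf m)(\boldsymbol\sigma, \mathbf p) = i(\boldsymbol\sigma, \mathbf m\times\mathbf p)$$
and
$$(\boldsymbol\sigma, \mathbf p)(\boldsymbol\sigma, \mathbf m) = i(\boldsymbol\sigma, \mathbf p\times\mathbf m),$$
so that
$$\begin{aligned}
(\boldsymbol\sigma, \mathbf m)(\boldsymbol\sigma, \mathbf p)+(\boldsymbol\sigma, \mathbf p)(\boldsymbol\sigma, \mathbf m)
&= i\sum\sigma_1\{m_2p_3-m_3p_2+p_2m_3-p_3m_2\}\\
&= i\sum\sigma_1\,.\,2i\hbar p_1 = -2\hbar(\boldsymbol\sigma, \mathbf p),
\end{aligned}$$
or
$$\{(\boldsymbol\sigma, \mathbf m)+\hbar\}(\boldsymbol\sigma, \mathbf p)+(\boldsymbol\sigma, \mathbf p)\{(\boldsymbol\sigma, \mathbf m)+\hbar\} = 0.$$
Thus $(\boldsymbol\sigma, \mathbf m)+\hbar$ anticommutes with one of the terms in the expression
(38) for $H$, namely the term $-c\rho_1(\boldsymbol\sigma, \mathbf p)$, and commutes with the other
two. It follows that $\rho_3\{(\boldsymbol\sigma, \mathbf m)+\hbar\}$ commutes with all the three terms
in $H$ and is a constant of the motion. But the square of $\rho_3\{(\boldsymbol\sigma, \mathbf m)+\hbar\}$
is also $\mathbf M^2+\tfrac14\hbar^2$. We can therefore take
$$j\hbar = \rho_3\{(\boldsymbol\sigma, \mathbf m)+\hbar\}, \tag{40}$$
which gives us a convenient rational definition for $j$ which is
consistent with (39) and makes $j$ a constant of the motion. The eigenvalues
of this $j$ are all positive and negative integers, excluding zero.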

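By a further application of (32), we get
$$(\boldsymbol\sigma, \mathbf x)(\boldsymbol\sigma, \mathbf p) = (\mathbf x, \mathbf p)+i(\boldsymbol\sigma, \mathbf m)
= rp_r+i\rho_3j\hbar-i\hbar, \tag{41}$$
with the help of (40) and also of equation (68) of § 38. We introduce
the linear operator $\epsilon$ defined by
$$r\epsilon = \rho_1(\boldsymbol\sigma, \mathbf x). \tag{42}$$
Since $r$ commutes with $\rho_1$ and with $(\boldsymbol\sigma, \mathbf x)$, it must commute with $\epsilon$.
We thus have
$$r^2\epsilon^2 = [\rho_1(\boldsymbol\sigma, \mathbf x)]^2 = (\boldsymbol\sigma, \mathbf x)^2 = \mathbf x^2 = r^2,$$
or
$$\epsilon^2 = 1.$$
Now $\rho_1(\boldsymbol\sigma, \mathbf p)$ commutes with $j$, and since there is symmetry between
$\mathbf x$ and $\mathbf p$ so far as angular momentum is concerned, $\rho_1(\boldsymbol\sigma, \mathbf x)$ must also
commute with $j$. Hence $\epsilon$ commutes with $j$. Further, $\epsilon$ must commute
with $p_r$, since we have
$$(\boldsymbol\sigma, \mathbf x)(\mathbf x, \mathbf p)-(\mathbf x, \mathbf p)(\boldsymbol\sigma, \mathbf x) = (\boldsymbol\sigma,\ \mathbf x(\mathbf x, \mathbf p)-(\mathbf x, \mathbf p)\mathbf x) = i\hbar(\boldsymbol\sigma, \mathbf x),$$
which gives
$$r\epsilon\,rp_r-rp_r\,r\epsilon = i\hbar r\epsilon$$
or
$$r^2\epsilon p_r-r^2p_r\epsilon = 0.$$
From (41) and (42) we obtain
$$r\epsilon\rho_1(\boldsymbol\sigma, \mathbf p) = rp_r+i\rho_3j\hbar-i\hbar$$
or
$$\rho_1(\boldsymbol\sigma, \mathbf p) = \epsilon(p_r-i\hbar/r)+i\epsilon\rho_3j\hbar/r.$$
Thus (38) becomes
$$H/c = -e/c\,.\,A_0-\epsilon(p_r-i\hbar/r)-i\epsilon\rho_3j\hbar/r-\rho_3mc.$$
This gives our Hamiltonian expressed in terms of polar variables. It
should be noticed that $\epsilon$ and $\rho_3$ commute with all the other variables
occurring in $H$ and anticommute with one another. This means that
we can take a representation with $\rho_3$ diagonal in which $\epsilon$ and $\rho_3$ are
represented respectively by the matrices
$$\begin{pmatrix}0&-i\\ i&0\end{pmatrix},\qquad\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}. \tag{43}$$
If $r$ is also diagonal in the representation, the representative
$\langle r'\rho_3'|\rangle$ of a ket will have two components, $\langle r', 1|\rangle = \psi_a(r')$ and
$\langle r', -1|\rangle = \psi_b(r')$ say, referring to the two rows and columns of the
matrices (43).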


72. The fine-structure of the energy-levels of hydrogen 

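We shall now take the case of the hydrogen atom, for which $A_0 = e/r$,
and work out its energy-levels, given by the eigenvalues $H'$ of $H$.
The equation $(H'-H)|H'\rangle = 0$ which defines these eigenvalues, when
written in terms of representatives in the representation discussed
above with $\epsilon$ and $\rho_3$ represented by the matrices (43), gives the
equations
$$\Bigl(\frac{H'}{c}+\frac{e^2}{cr}\Bigr)\psi_a-\hbar\frac{\partial}{\partial r}\psi_b-\frac{j+1}{r}\hbar\psi_b+mc\,\psi_a = 0,$$
$$\Bigl(\frac{H'}{c}+\frac{e^2}{cr}\Bigr)\psi_b+\hbar\frac{\partial}{\partial r}\psi_a-\frac{j-1}{r}\hbar\psi_a-mc\,\psi_b = 0.$$
If we put
$$\frac{\hbar}{mc+H'/c} = a_1,\qquad\frac{\hbar}{mc-H'/c} = a_2, \tag{44}$$
these equations reduce to
$$\begin{aligned}
\Bigl(\frac1{a_1}+\frac\alpha r\Bigr)\psi_a-\Bigl(\frac{\partial}{\partial r}+\frac{j+1}{r}\Bigr)\psi_b &= 0,\\
-\Bigl(\frac1{a_2}-\frac\alpha r\Bigr)\psi_b+\Bigl(\frac{\partial}{\partial r}-\frac{j-1}{r}\Bigr)\psi_a &= 0,
\end{aligned} \tag{45}$$
where $\alpha = e^2/\hbar c$, which is a small number. We shall solve these
equations by a similar method to that used for equation (73) in § 39.
Put
$$\psi_a = r^{-1}e^{-r/a}f,\qquad\psi_b = r^{-1}e^{-r/a}g, \tag{46}$$
introducing two new functions, $f$ and $g$, of $r$, where
$$a = (a_1a_2)^{1/2} = \hbar(m^2c^2-H'^2/c^2)^{-1/2}. \tag{47}$$
Equations (45) become
$$\begin{aligned}
\Bigl(\frac1{a_1}+\frac\alpha r\Bigr)f-\Bigl(\frac{\partial}{\partial r}-\frac1a+\frac jr\Bigr)g &= 0,\\
-\Bigl(\frac1{a_2}-\frac\alpha r\Bigr)g+\Bigl(\frac{\partial}{\partial r}-\frac1a-\frac jr\Bigr)f &= 0.
\end{aligned} \tag{48}$$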

We now try for a solution in which $f$ and $g$ are in the form of power
series,
$$f = \sum_sc_sr^s,\qquad g = \sum_sc'_sr^s, \tag{49}$$
in which consecutive values of $s$ differ by unity though these values
need not be integers. Substituting these expressions for $f$ and $g$ in
(48) and picking out coefficients of $r^{s-1}$, we obtain
$$\begin{aligned}
c_{s-1}/a_1+\alpha c_s-(s+j)c'_s+c'_{s-1}/a &= 0,\\
-c'_{s-1}/a_2+\alpha c'_s+(s-j)c_s-c_{s-1}/a &= 0.
\end{aligned} \tag{50}$$
By multiplying the first of these equations by $a$ and the second
by $a_2$ and adding, we eliminate both $c_{s-1}$ and $c'_{s-1}$, since from
(47) $a/a_1 = a_2/a$. We are left with
$$[a\alpha+a_2(s-j)]c_s+[a_2\alpha-a(s+j)]c'_s = 0, \tag{51}$$
a relation which shows the connexion between the primed and
unprimed $c$'s.

The boundary condition at $r = 0$ requires that $r\psi_a$ and $r\psi_b\to0$ as
$r\to0$, so from (46) $f$ and $g\to0$ as $r\to0$. Thus the series (49) must
terminate on the side of small $s$. If $s_0$ is the minimum value of $s$ for
which $c_s$ and $c'_s$ do not both vanish, we obtain from (50), by putting
$s = s_0$ and $c_{s_0-1} = c'_{s_0-1} = 0$,
$$\begin{aligned}
\alpha c_{s_0}-(s_0+j)c'_{s_0} &= 0,\\
\alpha c'_{s_0}+(s_0-j)c_{s_0} &= 0,
\end{aligned} \tag{52}$$
which give
$$\alpha^2 = -s_0^2+j^2.$$
Since the boundary condition requires that the minimum value of $s$
shall be greater than zero, we must take
$$s_0 = +\sqrt{(j^2-\alpha^2)}.$$
To investigate the convergence of the series (49) we shall determine
the ratio $c_s/c_{s-1}$ for large $s$. Equation (51) and the second of equations
(50) give approximately, when $s$ is large,
$$a_2c_s = ac'_s$$
and
$$sc_s = c_{s-1}/a+c'_{s-1}/a_2.$$
Hence
$$c_s/c_{s-1} = 2/as.$$
The series (49) will therefore converge like
$$\sum_s\frac1{s!}\Bigl(\frac{2r}{a}\Bigr)^s,$$
or $e^{2r/a}$. This result is similar to that obtained in § 39 and allows us
to infer, as in § 39, that all values of $H'$ are permissible for which $a$
is pure imaginary, i.e. from (47), for which $H' > mc^2$, while for
$H' < mc^2$ we take $a$ to be positive and then find that only those
values of $H'$ are permissible for which the series (49) terminate on
the side of large $s$.

If the series (49) terminate with the terms $c_s$ and $c'_s$, so that
$c_{s+1} = c'_{s+1} = 0$, we obtain from (50) with $s+1$ substituted for $s$
$$\begin{aligned}
c_s/a_1+c'_s/a &= 0,\\
-c'_s/a_2-c_s/a &= 0.
\end{aligned} \tag{53}$$
These two equations are equivalent on account of (47). When
combined with (51), they give
$$a_1[a\alpha+a_2(s-j)] = a[a_2\alpha-a(s+j)],$$
which reduces to
$$2a_1a_2s = a(a_2-a_1)\alpha$$
or
$$\frac s\alpha = \frac12\Bigl(\frac a{a_1}-\frac a{a_2}\Bigr) = \frac{aH'}{c\hbar},$$
with the help of (44). Squaring and using (47), we obtain
$$s^2(m^2c^2-H'^2/c^2) = \alpha^2H'^2/c^2.$$
Hence
$$\frac{H'}{mc^2} = \Bigl(1+\frac{\alpha^2}{s^2}\Bigr)^{-1/2}.$$
The $s$ here, which specifies the last term in the series, must be greater
than $s_0$ by some integer not less than zero. Calling this integer $n$,
we have
$$s = n+\sqrt{(j^2-\alpha^2)}$$
and thus
$$\frac{H'}{mc^2} = \Bigl\{1+\frac{\alpha^2}{\{n+\sqrt{(j^2-\alpha^2)}\}^2}\Bigr\}^{-1/2}. \tag{54}$$
This formula gives the discrete energy-levels of the hydrogen
spectrum and was first obtained by Sommerfeld working with Bohr's
orbit theory. There are two quantum numbers $n$ and $j$ involved, but
owing to $\alpha^2$ being very small the energy depends almost entirely on
$n+|j|$. Values of $n$ and $|j|$ that give the same $n+|j|$ give rise to a
set of energy-levels lying very close to one another, and to the
energy-level given by the non-relativistic formula (80) of § 39 with
$s = n+|j|$, apart from the constant term $mc^2$.
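
The closeness of levels with the same $n+|j|$ can be made concrete by evaluating formula (54), $H'/mc^2 = \{1+\alpha^2/(n+\sqrt{j^2-\alpha^2})^2\}^{-1/2}$, numerically. The sketch below is illustrative only: the values of $\alpha$ and of $mc^2$ in electron-volts are modern rounded figures assumed here, the function names are invented, and the exclusion of positive $j$ for $n = 0$ anticipates the restriction derived at the end of this section.

```python
import math

ALPHA = 1 / 137.035999     # fine-structure constant (assumed value)
MC2_EV = 510998.95         # electron rest energy in eV (assumed value)


def level_ratio(n, j, alpha=ALPHA):
    """H'/mc^2 for integer n >= 0 and non-zero integer j."""
    if n < 0 or j == 0 or (n == 0 and j > 0):
        raise ValueError("need n >= 0, j != 0, and j < 0 when n == 0")
    s = n + math.sqrt(j * j - alpha * alpha)
    return 1.0 / math.sqrt(1.0 + alpha * alpha / (s * s))


def binding_energy_ev(n, j):
    """Amount by which the level lies below mc^2, in eV."""
    return (1.0 - level_ratio(n, j)) * MC2_EV
```

For example, `binding_energy_ev(0, -1)` is close to 13.6 eV, while the two levels with $n+|j| = 2$, namely `(n, j) = (1, 1)` and `(0, -2)`, differ by only a few times $10^{-5}$ eV.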

We used equations (53) by combining them with (51), but this does
not make full use of (53) since the coefficients of $c_s$ and $c'_s$ in (51) may
both vanish. In this case we get, multiplying the first coefficient by
$a$ and the second by $a_2$ and adding,
$$(a^2+a_2^2)\alpha-2aa_2j = 0.$$
With the help of (47) and (44) this gives
$$(a_1+a_2)\alpha = 2aj$$
or
$$\frac{2j}{\alpha} = \frac{a_1}{a}+\frac{a_2}{a} = \frac{2mca}{\hbar} = \frac{2mc}{(m^2c^2-H'^2/c^2)^{1/2}}$$
or
$$1-\frac{H'^2}{m^2c^4} = \frac{\alpha^2}{j^2}.$$
Since $H'$ must be positive, this leads to
$$\frac{H'}{mc^2} = \frac{\sqrt{(j^2-\alpha^2)}}{|j|}, \tag{55}$$
which is the value of $H'$ given by (54) when $n = 0$. The case $n = 0$
thus needs further investigation to see whether the conditions (53)
are then fulfilled.

With $n = 0$, the maximum value of $s$ is the same as the minimum,
so equations (53) with $s_0$ substituted for $s$ should agree with (52).
Now (55) gives, from (44) and (47),
$$\frac1{a_1} = \frac{mc}{\hbar}\Bigl\{1+\frac{\sqrt{(j^2-\alpha^2)}}{|j|}\Bigr\},\qquad\frac1a = \frac{mc}{\hbar}\frac{\alpha}{|j|},$$
so the first of equations (53) with $s_0$ substituted for $s$ gives
$$c_{s_0}\{|j|+\sqrt{(j^2-\alpha^2)}\}+c'_{s_0}\alpha = 0.$$
This agrees with the second of equations (52) provided $j$ is negative.
We can conclude that, for $n = 0$, $j$ must be a negative integer, while
for the other values of $n$ all non-zero integral values of $j$ are allowed.

73. Theory of the positron

It has been mentioned in § 67 that the wave equation for the
electron admits of twice as many solutions as it ought to, half of them
referring to states with negative values for the kinetic energy $cp_0+eA_0$.
This difficulty was introduced as soon as we passed from equation (3)
to equation (4) and is inherent in any relativistic theory. It occurs
also in classical relativistic theory, but is not then serious since, owing
to the continuity in the variation of all classical dynamical variables,
if the kinetic energy $cp_0+eA_0$ is initially positive (when it must be
greater than or equal to $mc^2$), it cannot subsequently be negative
(when it would have to be less than or equal to $-mc^2$). In the
quantum theory, however, discontinuous transitions may take place,
so that if the electron is initially in a state of positive kinetic energy
it may make a transition to a state of negative kinetic energy. It is
therefore no longer permissible simply to ignore the negative-energy
states, as one can do in the classical theory.

Let us examine the negative-energy solutions of the equation
$$\Bigl\{\Bigl(p_0+\frac ecA_0\Bigr)+\alpha_1\Bigl(p_1+\frac ecA_1\Bigr)+\alpha_2\Bigl(p_2+\frac ecA_2\Bigr)+\alpha_3\Bigl(p_3+\frac ecA_3\Bigr)+\alpha_mmc\Bigr\}|x\rangle = 0 \tag{56}$$
a little more closely. For this purpose it is convenient to use a
representation of the $\alpha$'s in which all the elements of the matrices
representing $\alpha_1$, $\alpha_2$, and $\alpha_3$ are real and all those of the matrix representing
$\alpha_m$ are pure imaginary. Such a representation may be obtained, for
instance, from that of § 67 by interchanging the expressions for $\alpha_2$
and $\alpha_m$ in (7). If equation (56) is expressed as a matrix equation in
this representation and we put $-i$ for $i$ in all the matrix elements,
we get, remembering (1) and (2), the matrix form of the equation
$$\Bigl\{\Bigl(-p_0+\frac ecA_0\Bigr)+\alpha_1\Bigl(-p_1+\frac ecA_1\Bigr)+\alpha_2\Bigl(-p_2+\frac ecA_2\Bigr)+\alpha_3\Bigl(-p_3+\frac ecA_3\Bigr)-\alpha_mmc\Bigr\}|x^*\rangle = 0, \tag{57}$$
where $|x^*\rangle$ is the ket whose representative is the conjugate complex
of the representative of $|x\rangle$. Thus each solution $|x\rangle$ of (56)
determines uniquely a solution $|x^*\rangle$ of (57) with the conjugate complex
representative. Further, if the solution $|x\rangle$ of (56) belongs to a
negative value for $cp_0+eA_0$, the corresponding solution $|x^*\rangle$ of (57)
will belong to a positive value for $cp_0-eA_0$. But equation (57) is just
what one would get if one substituted $-e$ for $e$ in (56). It follows
that each negative-energy solution of (56) corresponds to a
positive-energy solution of the wave equation obtained from (56) by
substitution of $-e$ for $e$, which solution represents an electron of charge $+e$
(instead of $-e$, as we had up to the present) moving through the
given electromagnetic field. Thus the unwanted solutions of (56) are
connected with the motion of an electron with a charge $+e$. (It is
not possible, of course, with an arbitrary electromagnetic field, to
separate the solutions of (56) definitely into those referring to positive
and those referring to negative values for $cp_0+eA_0$, as such a
separation would imply that transitions from one kind to the other
do not occur. The preceding discussion is therefore only a rough
one, applying to the case when such a separation is approximately
possible.)
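
The existence of a representation with $\alpha_1, \alpha_2, \alpha_3$ real and $\alpha_m$ pure imaginary can be checked directly. The sketch below starts from a conventional representation, $\alpha_k = \rho_1\sigma_k$ and $\alpha_m = \rho_3$ built from Pauli matrices, which is assumed here to play the role of the representation of § 67, and interchanges the matrices for $\alpha_2$ and $\alpha_m$. The dictionary keys and function names are illustrative.

```python
import numpy as np

I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Assumed starting representation: alpha_k = rho_1 sigma_k, alpha_m = rho_3.
START = {
    "a1": np.kron(sx, sx),
    "a2": np.kron(sx, sy),
    "a3": np.kron(sx, sz),
    "am": np.kron(sz, I2),
}


def swapped_representation():
    """Interchange the matrices representing alpha_2 and alpha_m."""
    rep = dict(START)
    rep["a2"], rep["am"] = START["am"], START["a2"]
    return rep


def is_valid(rep):
    """Check that the four matrices anticommute in pairs and square to unity."""
    mats = list(rep.values())
    for i, a in enumerate(mats):
        for j, b in enumerate(mats):
            expected = 2 * np.eye(4) if i == j else np.zeros((4, 4))
            if not np.allclose(a @ b + b @ a, expected):
                return False
    return True
```

In the swapped representation the algebraic relations are unchanged, the first three matrices have vanishing imaginary part, and `rep["am"]` has vanishing real part, so that taking the conjugate complex changes the sign of the $\alpha_m$ term only.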

In this way we are led to infer that the negative-energy solutions
of (56) refer to the motion of a new kind of particle having the mass
of an electron and the opposite charge. Such particles have been
observed experimentally and are called positrons. We cannot, however,
simply assert that the negative-energy solutions represent positrons,
as this would make the dynamical relations all wrong. For
instance, it is certainly not true that a positron has a negative kinetic
energy. We must therefore establish the theory of the positrons on
a somewhat different footing. We assume that nearly all the
negative-energy states are occupied, with one electron in each state in accordance
with the exclusion principle of Pauli. An unoccupied negative-energy
state will now appear as something with a positive energy, since to
make it disappear, i.e. to fill it up, we should have to add to it an
electron with negative energy. We assume that these unoccupied
negative-energy states are the positrons.

These assumptions require there to be a distribution of electrons
of infinite density everywhere in the world. A perfect vacuum is a
region where all the states of positive energy are unoccupied and all
those of negative energy are occupied. In a perfect vacuum Maxwell's
equation
$$\operatorname{div}\boldsymbol{\mathcal E} = 0$$
must, of course, be valid. This means that the infinite distribution
of negative-energy electrons does not contribute to the electric field.
Only departures from the distribution in a vacuum will contribute
to the electric density $\rho$ in Maxwell's equation
$$\operatorname{div}\boldsymbol{\mathcal E} = 4\pi\rho.$$
Thus there will be a contribution $-e$ for each occupied state of
positive energy and a contribution $+e$ for each unoccupied state of
negative energy.

The exclusion principle will operate to prevent a positive-energy 
electron ordinarily from making transitions to states of negative 
energy. It will still be possible, however, for such an electron to 
drop into an unoccupied state of negative energy. In this case we 
should have an electron and positron disappearing simultaneously,
their energy being emitted in the form of radiation. The converse 
process would consist in the creation of an electron and a positron 
from electromagnetic radiation. 

The theory of the positron here given appears at first sight to treat 
the electrons and positrons on very different footings, but actually 
the fundamental ideas of the theory are symmetrical between the 
electrons and positrons. We should have an equivalent theory if we 
supposed the positrons to be the basic particles, described by wave 
equations of the form (9) with — e for e, and then supposed that nearly 
all the states of negative energy for the positron are filled up, a hole 
in the distribution of negative-energy positrons being then interpreted
as an ordinary electron. The theory could be developed
consistently with the hypothesis that all the laws of physics are 
symmetrical between positive and negative electric charge. 



XII 

QUANTUM ELECTRODYNAMICS 
74. Relativistic notation 

In § 63 a theory was given of the interaction of an atom with a field 
of radiation. This theory was an approximate one, valid for radiation 
of long wave-length and for a certain simplified model of the atom. 
Our present problem is to improve this theory, and in particular to 
make it relativistic, so that it may be applied to particles moving at 
high speed. We must first set up a notation suitable for handling the 
relativistic equations with which we shall have to deal. 

We choose units of space and time which make the velocity of light
unity, so that $c$ will no longer appear in our equations. A point in
space-time is located by its three Cartesian coordinates $x_1, x_2, x_3$ and
its time $t = x_0$, which together form a 4-vector $x_\mu$ ($\mu = 0, 1, 2, 3$), or
$x$ as we may write it in vector notation. Two 4-vectors $a$ and $b$ have
a Lorentz-invariant scalar product $(ab)$ given by
$$(ab) = a_0b_0-a_1b_1-a_2b_2-a_3b_3 = a_0b_0-(\mathbf{ab}), \tag{1}$$
$(\mathbf{ab})$ being the three-dimensional scalar product of the
three-dimensional parts of $a$ and $b$. To take into account the $-$ signs in (1), it
is convenient to introduce vector components with raised suffixes,
defined by
$$a^0 = a_0,\quad a^1 = -a_1,\quad a^2 = -a_2,\quad a^3 = -a_3, \tag{2}$$
so that the scalar product $(ab)$ may be written
$$(ab) = a_\mu b^\mu = a^\mu b_\mu, \tag{3}$$
a summation being implied over a repeated (letter) suffix in a term.
The components $a^\mu$ are called the covariant components of the 4-vector
$a$, the original components $a_\mu$, which transform like the four
coordinates $x_\mu$ of a point in space-time, being called the contravariant
components.

The fundamental tensor $g_{\mu\nu}$ is defined by
$$g_{00} = 1,\quad g_{11} = g_{22} = g_{33} = -1,\qquad g_{\mu\nu} = 0\ \text{for}\ \mu\neq\nu.$$
With its help we can write the rule (2) connecting the covariant
and contravariant components of a vector
$$a_\mu = g_{\mu\nu}a^\nu,$$
and we can write the scalar product $(ab)$ as
$$(ab) = g_{\mu\nu}a^\mu b^\nu.$$
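
The bookkeeping of raised and lowered suffixes is easily mirrored in a few lines of code. The sketch below is only an illustration under assumptions: 4-vectors are stored by their original (lower-suffix) components in the order $(a_0, a_1, a_2, a_3)$, the velocity of light is unity, and the function names are invented. It also checks the Lorentz invariance of $(ab)$ under a boost along $x_1$.

```python
import numpy as np

# Fundamental tensor: g_00 = 1, g_11 = g_22 = g_33 = -1, off-diagonal elements zero.
G = np.diag([1.0, -1.0, -1.0, -1.0])


def raise_suffix(a):
    """Components with raised suffixes: a^0 = a_0, a^r = -a_r."""
    return G @ np.asarray(a, dtype=float)


def scalar_product(a, b):
    """(ab) = a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3."""
    return float(np.asarray(a, dtype=float) @ G @ np.asarray(b, dtype=float))


def boost_x1(a, v):
    """Lorentz transformation with velocity v (|v| < 1) along x_1."""
    gamma = 1.0 / np.sqrt(1.0 - v * v)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * v
    return L @ np.asarray(a, dtype=float)
```

Since the fundamental tensor is its own reciprocal, the same matrix `G` both raises and lowers suffixes, and `scalar_product(boost_x1(a, v), boost_x1(b, v))` equals `scalar_product(a, b)` for any admissible `v`.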

The operators $\partial/\partial x_\mu$ form the covariant components of a 4-vector,
and the contravariant components of the vector are written $\partial/\partial x^\mu$.
Equations (1) and (2) of § 66 may be written
$$p_\mu = i\hbar\frac{\partial}{\partial x^\mu}\quad\text{or}\quad p^\mu = i\hbar\frac{\partial}{\partial x_\mu},$$
and show how the momentum-energy 4-vector of a particle is related
to the operator of differentiation applied to the wave function.

The function $\delta(xx)$ is evidently Lorentz invariant. It vanishes
everywhere except on the light-cone with the origin as vertex, i.e. the
three-dimensional space $(xx) = 0$. This light-cone consists of two
distinct parts, a future part, for which $x_0 > 0$, and a past part, for which
$x_0 < 0$. The function which equals $\delta(xx)$ on the future part of the
light-cone and $-\delta(xx)$ on the past part of the light-cone is also
Lorentz invariant. This function, which equals $\delta(xx)x_0/|x_0|$, plays
an important role in the dynamical theory of fields, so we introduce a
special notation for it. We define
$$\Delta(x) = 2\delta(xx)x_0/|x_0|. \tag{6}$$
This definition gives a meaning to the function $\Delta$ applied to any
4-vector. With the help of (1) and of (9) of § 15, we can express
$\delta(xx)$ in the form
$$\delta(xx) = \delta(x_0^2-\mathbf x^2) = \tfrac12|\mathbf x|^{-1}\{\delta(x_0-|\mathbf x|)+\delta(x_0+|\mathbf x|)\}, \tag{7}$$
$|\mathbf x|$ being the length of the three-dimensional part of $x$, and then
$\Delta(x)$ takes the form
$$\Delta(x) = |\mathbf x|^{-1}\{\delta(x_0-|\mathbf x|)-\delta(x_0+|\mathbf x|)\}. \tag{8}$$
$\Delta(x)$ is defined to have the value zero at the origin, and evidently
$\Delta(-x) = -\Delta(x)$.

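Let us make a Fourier analysis of $\Delta(x)$. Using $d^4x$ to denote
$dx_0\,dx_1\,dx_2\,dx_3$ and $d^3x$ to denote $dx_1\,dx_2\,dx_3$ we have, for any 4-vector $k$,
$$\int\Delta(x)e^{i(kx)}\,d^4x = \int|\mathbf x|^{-1}\{\delta(x_0-|\mathbf x|)-\delta(x_0+|\mathbf x|)\}e^{i[k_0x_0-(\mathbf{kx})]}\,d^4x
= \int|\mathbf x|^{-1}\{e^{ik_0|\mathbf x|}-e^{-ik_0|\mathbf x|}\}e^{-i(\mathbf{kx})}\,d^3x.$$
By introducing polar coordinates $|\mathbf x|, \theta, \phi$ in the three-dimensional
$x_1x_2x_3$ space, with the direction of the three-dimensional part of $k$ as
pole, we get
$$\begin{aligned}
\int\Delta(x)e^{i(kx)}\,d^4x &= \iiint\{e^{ik_0|\mathbf x|}-e^{-ik_0|\mathbf x|}\}e^{-i|\mathbf k||\mathbf x|\cos\theta}|\mathbf x|\sin\theta\,d\theta\,d\phi\,d|\mathbf x|\\
&= 2\pi\int_0^\infty\{e^{ik_0|\mathbf x|}-e^{-ik_0|\mathbf x|}\}\,d|\mathbf x|\int_0^\pi e^{-i|\mathbf k||\mathbf x|\cos\theta}|\mathbf x|\sin\theta\,d\theta\\
&= 2\pi i|\mathbf k|^{-1}\int_0^\infty\{e^{ik_0|\mathbf x|}-e^{-ik_0|\mathbf x|}\}\,d|\mathbf x|\,\{e^{-i|\mathbf k||\mathbf x|}-e^{i|\mathbf k||\mathbf x|}\}\\
&= 2\pi i|\mathbf k|^{-1}\int_{-\infty}^\infty\{e^{i(k_0-|\mathbf k|)a}-e^{i(k_0+|\mathbf k|)a}\}\,da\\
&= 4\pi^2i|\mathbf k|^{-1}\{\delta(k_0-|\mathbf k|)-\delta(k_0+|\mathbf k|)\}\\
&= 4\pi^2i\,\Delta(k).
\end{aligned} \tag{9}$$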
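
The only non-trivial integration in this reduction is the one over $\theta$, namely $\int_0^\pi e^{-i\kappa\rho\cos\theta}\rho\sin\theta\,d\theta = i\kappa^{-1}\{e^{-i\kappa\rho}-e^{i\kappa\rho}\}$ with $\kappa = |\mathbf k|$ and $\rho = |\mathbf x|$. The sketch below compares a simple trapezoidal evaluation with this closed form; the function names and the choice of quadrature are assumptions made for illustration.

```python
import numpy as np


def angular_integral_numeric(kappa, rho, n=20001):
    """Trapezoidal estimate of the theta-integral over [0, pi]."""
    theta = np.linspace(0.0, np.pi, n)
    integrand = np.exp(-1j * kappa * rho * np.cos(theta)) * rho * np.sin(theta)
    h = theta[1] - theta[0]
    return h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))


def angular_integral_closed(kappa, rho):
    """Closed form i * (exp(-i kappa rho) - exp(i kappa rho)) / kappa."""
    return 1j * (np.exp(-1j * kappa * rho) - np.exp(1j * kappa * rho)) / kappa
```

The closed form is real and equal to $2\sin(\kappa\rho)/\kappa$, and the numerical estimate agrees with it to well within the accuracy of the quadrature.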


Thus the Fourier analysis gives the same function again with the
coefficient $4\pi^2i$. Interchanging $k$ and $x$ in (9), we get
$$\Delta(x) = -i/4\pi^2\,.\int\Delta(k)e^{i(kx)}\,d^4k. \tag{10}$$
Some of the important properties of $\Delta(x)$ can easily be deduced
from its Fourier resolution. In the first place equation (10) shows that
$\Delta(x)$ can be resolved into waves all travelling with the velocity of
light. To get an equation for this result we apply the operator $\Box$ to
both sides of (10), thus
$$\Box\Delta(x) = -i/4\pi^2\,.\int\Delta(k)\Box e^{i(kx)}\,d^4k = i/4\pi^2\,.\int(kk)\Delta(k)e^{i(kx)}\,d^4k.$$
Now $(kk)\Delta(k) = 0$, and hence
$$\Box\Delta(x) = 0. \tag{11}$$

This equation holds throughout space-time. We can give a meaning
to $\Box\Delta(x)$ at a point where $\Delta(x)$ is singular by taking the integral
of $\Box\Delta(x)$ over a small four-dimensional space surrounding the point
and transforming it to a three-dimensional surface integral by Gauss's
theorem. Equation (11) informs us that the three-dimensional surface
integral always vanishes.

The function $\Delta(x)$ vanishes all over the three-dimensional surface
$x_0 = 0$. Let us determine the value of $\partial\Delta(x)/\partial x_0$ on this surface. It
evidently vanishes everywhere except at the point $x_1 = x_2 = x_3 = 0$,
where it has a singularity which can be evaluated as follows.
Differentiating both sides of (10) with respect to $x_0$, we get
$$\begin{aligned}
\partial\Delta(x)/\partial x_0 &= 1/4\pi^2\,.\int k_0\Delta(k)e^{i(kx)}\,d^4k\\
&= 1/4\pi^2\,.\int k_0|\mathbf k|^{-1}\{\delta(k_0-|\mathbf k|)-\delta(k_0+|\mathbf k|)\}e^{i(kx)}\,d^4k\\
&= 1/4\pi^2\,.\int\{\delta(k_0-|\mathbf k|)+\delta(k_0+|\mathbf k|)\}e^{i(kx)}\,d^4k.
\end{aligned}$$
Putting $x_0 = 0$ on both sides here, we get
$$\begin{aligned}
[\partial\Delta(x)/\partial x_0]_{x_0=0} &= 1/4\pi^2\,.\int\{\delta(k_0-|\mathbf k|)+\delta(k_0+|\mathbf k|)\}e^{-i(\mathbf{kx})}\,d^4k\\
&= 1/2\pi^2\,.\int e^{-i(\mathbf{kx})}\,d^3k\\
&= 4\pi\,\delta(x_1)\delta(x_2)\delta(x_3).
\end{aligned} \tag{12}$$
Thus the ordinary $\delta$ singularity, with the coefficient $4\pi$, appears at
the point $x_1 = x_2 = x_3 = 0$.

75. The quantum conditions for the field 

In § 63 a theory of a field of radiation without interaction with
matter was first developed and the interaction was taken into account
subsequently. In the theory without interaction dynamical variables
were introduced to describe the field, commutation relations were
established for these dynamical variables, and a Hamiltonian was set
up which made the dynamical variables vary correctly with the time.
No approximations were made in this work. The theory would
therefore be a quite satisfactory, exact theory of radiation without
interaction with matter, were it not for one feature in it, namely our
taking the scalar potential to be zero at the outset. This feature
spoils the relativistic form of the theory and makes it unsuitable as
a starting-point from which to develop an accurate theory of radiation
in interaction with matter. We shall here consider how to put the
theory of radiation without interaction with matter into relativistic
form.

We leave the scalar potential $A_0$ arbitrary and it then forms,
together with the vector potential $A_1, A_2, A_3$, a 4-vector $A_\mu$. The
Maxwell equations (62) of § 63 must then be generalized to
$$\Box A_\mu = 0,\qquad\partial A_\mu/\partial x_\mu = 0. \tag{13}$$
For the present we shall ignore the second of these equations and
work only from the first. This equation shows that each $A_\mu$ can be
resolved into waves travelling with the velocity of light, so that its
Fourier resolution is of the form
$$A_\mu(x) = 2\int\delta(kk)A_{k\mu}e^{i(kx)}\,d^4k, \tag{14}$$
$x$ denoting a general point in space-time. The factor $\delta(kk)$ here
ensures that the integrand vanishes except for those values of the
4-vector $k$ which satisfy $(kk) = 0$, and the coefficient $A_{k\mu}$ may be
considered as undefined except when $(kk) = 0$. Since $A_\mu(x)$ is real,
we must have $A_{-k\mu} = \bar A_{k\mu}$, so (14) may also be written
$$A_\mu(x) = 2\int_{k_0>0}\delta(kk)\{A_{k\mu}e^{i(kx)}+\bar A_{k\mu}e^{-i(kx)}\}\,d^4k. \tag{15}$$
With the help of formula (7) applied to $k$, this goes over into
$$\begin{aligned}
A_\mu(x) &= \int_{k_0>0}|\mathbf k|^{-1}\delta(k_0-|\mathbf k|)\{A_{k\mu}e^{i(kx)}+\bar A_{k\mu}e^{-i(kx)}\}\,d^4k\\
&= \int\{A_{\mathbf k\mu}e^{i(kx)}+\bar A_{\mathbf k\mu}e^{-i(kx)}\}k_0^{-1}\,d^3k,
\end{aligned} \tag{16}$$
where it is implied in the last integrand that $k_0 = |\mathbf k|$, i.e. that $k$ is
a 4-vector lying on the future part of the light-cone.

Equation (16) is usually the most convenient form in which to give
the Fourier resolution of $A_\mu$. For $\mu = 1, 2, 3$ it agrees with (63) of
§ 63, except for the factor $k_0^{-1}$ in (16). This factor is a desirable one
to have in a relativistic theory, since the product $k_0^{-1}d^3k$ gives a
Lorentz invariant element on the light-cone $(kk) = 0$. The Lorentz
invariance can be proved by direct geometrical methods, and can
also be inferred from the above analysis, it being evident that the
coefficient $A_{k\mu}$ introduced by (14) is a 4-vector for each value of $k$
on the light-cone, so that the factor $\{\ \}$ in (16) is also a 4-vector, and
hence the remaining factors on the right-hand side of (16) must form
a four-dimensional scalar.

The quantities $A_\mu$ and $\partial A_\mu/\partial x_0$ for all $x_1, x_2, x_3$ at a given time
$x_0 = t$ are sufficient, with the help of the first of equations (13), to
determine the potentials throughout space-time, so these quantities
may be considered as the dynamical variables describing the field of
radiation considered as a dynamical system. (They are the ordinary
dynamical variables of the classical theory, or the Heisenberg
dynamical variables of the quantum theory.) Define the quantities $A_{\mathbf k\mu t}$ for
$k_0 > 0$ by
$$A_{\mathbf k\mu t} = A_{\mathbf k\mu}e^{ik_0t}. \tag{17}$$
Equation (16) then gives
$$\begin{aligned}
A_\mu(x) &= \int\{A_{\mathbf k\mu t}e^{-i(\mathbf{kx})}+\bar A_{\mathbf k\mu t}e^{i(\mathbf{kx})}\}k_0^{-1}\,d^3k,\\
\partial A_\mu(x)/\partial x_0 &= i\int\{A_{\mathbf k\mu t}e^{-i(\mathbf{kx})}-\bar A_{\mathbf k\mu t}e^{i(\mathbf{kx})}\}\,d^3k,
\end{aligned} \tag{18}$$
$k_0$ being understood as equal to $|\mathbf k|$ in the integrands here. These
equations express $A_\mu$ and $\partial A_\mu/\partial x_0$ at time $t$ as functions of $A_{\mathbf k\mu t}$ and
$\bar A_{\mathbf k\mu t}$ not involving $t$ explicitly. By reversing the three-dimensional
Fourier analysis of equations (18) we can get $A_{\mathbf k\mu t}$ and $\bar A_{\mathbf k\mu t}$ as
functions of $A_\mu$ and $\partial A_\mu/\partial x_0$ at time $t$ not involving $t$ explicitly.
Hence we may take $A_{\mathbf k\mu t}$ and $\bar A_{\mathbf k\mu t}$ for all $\mu$ and all $\mathbf k$ with $k_0 > 0$ as
the dynamical variables describing the system, instead of $A_\mu$ and
$\partial A_\mu/\partial x_0$ at time $t$.

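We must now determine the quantum conditions for the $A_{\mathbf k\mu t}$ and
$\bar A_{\mathbf k\mu t}$. In the first place, variables referring to different values of $\mathbf k$ or
of $\mu$ belong to different degrees of freedom and therefore commute.
We can get information about the quantum conditions for variables
referring to the same value of $\mathbf k$ and $\mu$ from the work of § 63. To
connect up with this work, we pass over to discrete $\mathbf k$-values in
three-dimensional $\mathbf k$-space. Equation (73) of § 63 gives, on taking into
account that the present $A_{\mathbf k}$ variables are $k_0$ times those of § 63,
$$2\pi A_{\mathbf klt} = \ldots \tag{19}$$
Let us consider one particular discrete $\mathbf k$-value for which $k_1 = k_2 = 0$,
$k_3 = k_0 > 0$. Then the polarization variable $l$ can take on two values
referring to the two directions 1 and 2, so equation (19) gives, with
the help of the commutation relations for the $\eta$'s and $\bar\eta$'s, equations
(11) of § 60,
$$\begin{aligned}
\bar A_{\mathbf k1t}A_{\mathbf k1t}-A_{\mathbf k1t}\bar A_{\mathbf k1t} &= \hbar k_0s_{\mathbf k}/4\pi^2,\\
\bar A_{\mathbf k2t}A_{\mathbf k2t}-A_{\mathbf k2t}\bar A_{\mathbf k2t} &= \hbar k_0s_{\mathbf k}/4\pi^2.
\end{aligned} \tag{20}$$
With the help of (17), these equations may be written in terms of the
$A_{\mathbf k\mu}$, $\bar A_{\mathbf k\mu}$ for $k_0 > 0$
$$\begin{aligned}
\bar A_{\mathbf k1}A_{\mathbf k1}-A_{\mathbf k1}\bar A_{\mathbf k1} &= \hbar k_0s_{\mathbf k}/4\pi^2,\\
\bar A_{\mathbf k2}A_{\mathbf k2}-A_{\mathbf k2}\bar A_{\mathbf k2} &= \hbar k_0s_{\mathbf k}/4\pi^2.
\end{aligned} \tag{21}$$
The work of § 63 gives us no information about $A_{\mathbf k3}$ and $A_{\mathbf k0}$.
However, we can now obtain the quantum conditions for $A_{\mathbf k3}$ and
$A_{\mathbf k0}$ from the theory of relativity. Equations (21) have to be built
up into a relativistic set of equations and the only simple way of doing
so is by adding to them the two further equations
$$\begin{aligned}
\bar A_{\mathbf k3}A_{\mathbf k3}-A_{\mathbf k3}\bar A_{\mathbf k3} &= \hbar k_0s_{\mathbf k}/4\pi^2,\\
\bar A_{\mathbf k0}A_{\mathbf k0}-A_{\mathbf k0}\bar A_{\mathbf k0} &= -\hbar k_0s_{\mathbf k}/4\pi^2.
\end{aligned} \tag{22}$$
Note the opposite sign in the last of these equations. The four
equations (21) and (22), together with the conditions that $A_{\mathbf k\mu}$ and $\bar A_{\mathbf k\nu}$
commute for $\mu\neq\nu$, can then be written as a single tensor equation
$$A_{\mathbf k\mu}\bar A_{\mathbf k\nu}-\bar A_{\mathbf k\nu}A_{\mathbf k\mu} = g_{\mu\nu}/4\pi^2\,.\,\hbar k_0s_{\mathbf k}. \tag{23}$$
We get in this way the quantum conditions for all the dynamical
variables. Equation (23) can be extended to
$$A_{\mathbf k\mu}\bar A_{\mathbf k'\nu}-\bar A_{\mathbf k'\nu}A_{\mathbf k\mu} = g_{\mu\nu}/4\pi^2\,.\,\hbar k_0s_{\mathbf k}\delta_{\mathbf{kk}'}. \tag{24}$$
Let us now return to continuous $\mathbf k$-values. To convert $\delta_{\mathbf{kk}'}$ to
continuous $\mathbf k$-values we note that, for a general function $f(\mathbf k)$ in
three-dimensional $\mathbf k$-space,
$$\sum_{\mathbf k}f(\mathbf k)\delta_{\mathbf{kk}'} = f(\mathbf k') = \int f(\mathbf k)\delta_3(\mathbf k-\mathbf k')\,d^3k, \tag{25}$$
where $\delta_3(\mathbf k-\mathbf k')$ is the three-dimensional $\delta$ function
$$\delta_3(\mathbf k-\mathbf k') = \delta(k_1-k_1')\delta(k_2-k_2')\delta(k_3-k_3').$$
In order that (25) may conform to the standard formula connecting
sums and integrals, equation (52) of § 62, we must have
$$s_{\mathbf k}\delta_{\mathbf{kk}'} = \delta_3(\mathbf k-\mathbf k'). \tag{26}$$
Thus (24) goes over to
$$A_{\mathbf k\mu}\bar A_{\mathbf k'\nu}-\bar A_{\mathbf k'\nu}A_{\mathbf k\mu} = g_{\mu\nu}/4\pi^2\,.\,\hbar k_0\delta_3(\mathbf k-\mathbf k'). \tag{27}$$
This equation, together with the equations
$$\begin{aligned}
A_{\mathbf k\mu}A_{\mathbf k'\nu}-A_{\mathbf k'\nu}A_{\mathbf k\mu} &= 0,\\
\bar A_{\mathbf k\mu}\bar A_{\mathbf k'\nu}-\bar A_{\mathbf k'\nu}\bar A_{\mathbf k\mu} &= 0,
\end{aligned} \tag{28}$$
provide the quantum conditions for the field quantities in the theory
with continuous $\mathbf k$-values. We have here the formalism which must
be used instead of (11) of § 60 for dealing with a set of oscillators
whose number is a continuous infinity, equal to the number of points
in a volume. The number of degrees of freedom of the system is a
continuous infinity, and the $\delta$ function appears in the commutation
relations instead of the two-suffix $\delta$ symbol.

The quantum conditions for the field may also be expressed in
terms of the potentials $A_\mu(x)$ at different points $x$ in space-time.
We have from (16), (27), and (28)
$$\begin{aligned}
[A_\mu(x), A_\nu(x')] &= \iint[A_{\mathbf k\mu}e^{i(kx)}+\bar A_{\mathbf k\mu}e^{-i(kx)},\ A_{\mathbf k'\nu}e^{i(k'x')}+\bar A_{\mathbf k'\nu}e^{-i(k'x')}]\,k_0^{-1}k_0'^{-1}\,d^3k\,d^3k'\\
&= ig_{\mu\nu}/4\pi^2\,.\iint\{e^{-i(kx)}e^{i(k'x')}-e^{i(kx)}e^{-i(k'x')}\}\delta_3(\mathbf k-\mathbf k')k_0'^{-1}\,d^3k\,d^3k'\\
&= ig_{\mu\nu}/4\pi^2\,.\int\{e^{-i(k,\,x-x')}-e^{i(k,\,x-x')}\}k_0^{-1}\,d^3k.
\end{aligned} \tag{29}$$
This three-dimensional $\mathbf k$-integral is easily seen to be equal to the
four-dimensional $k$-integral over the whole of four-dimensional
$k$-space
$$ig_{\mu\nu}/4\pi^2\,.\int|\mathbf k|^{-1}\{\delta(k_0-|\mathbf k|)-\delta(k_0+|\mathbf k|)\}e^{-i(k,\,x-x')}\,d^4k
= ig_{\mu\nu}/4\pi^2\,.\int\Delta(k)e^{-i(k,\,x-x')}\,d^4k.$$
Evaluating this integral with the help of formula (10), we get finally
$$[A_\mu(x), A_\nu(x')] = g_{\mu\nu}\Delta(x-x'). \tag{30}$$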

We see that potentials at two points in space-time always commute 
unless the line joining the two points is a null-line (i.e. the traok of 
a light-ray). 
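
The criterion just stated can be illustrated numerically. The following is a minimal sketch and not part of the original text: the names `minkowski`, `separation_type` and `potentials_must_commute` and the tolerance `tol` are hypothetical, and the scalar product is assumed to have the signature (+, -, -, -) used in this chapter. It only classifies the separation of two points; according to (30) the potentials are guaranteed to commute whenever that separation is not null.

```python
def minkowski(a, b):
    """Scalar product (a, b) = a0*b0 - a1*b1 - a2*b2 - a3*b3 (assumed signature)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def separation_type(x, x_prime, tol=1e-12):
    """Classify the line joining x and x' as time-like, null or space-like."""
    d = [p - q for p, q in zip(x, x_prime)]
    s = minkowski(d, d)
    if abs(s) <= tol:
        return "null"
    return "time-like" if s > 0 else "space-like"


def potentials_must_commute(x, x_prime, tol=1e-12):
    """True when the separation is not null, so that Delta(x - x') vanishes."""
    return separation_type(x, x_prime, tol) != "null"
```

For instance, the points (0, 0, 0, 0) and (5, 3, 4, 0) are joined by a null-line, so this test gives no guarantee of commutation for them, while any strictly time-like or space-like pair passes.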

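Let us determine the quantum conditions for the quantities $A_\mu$ and
$\partial A_\mu/\partial x_0$ for various $x_1, x_2, x_3$ at a given time
$x_0 = t$. Using the suffix t to denote a quantity taken at the time
$x_0 = t$, we have, putting $x_0 = x_0' = t$ in (30),

    $[A_{\mu t}(\mathbf x),\ A_{\nu t}(\mathbf x')] = 0.$    (31)

Differentiating (30) with respect to $x_0$ and then putting $x_0 = x_0' = t$,
we get

    $[\{\partial A_\mu(\mathbf x)/\partial x_0\}_t,\ A_{\nu t}(\mathbf x')] = 4\pi g_{\mu\nu}\,\delta_3(\mathbf x-\mathbf x')$    (32)

from (12). Finally, differentiating (30) with respect to $x_0$ and $x_0'$ and
then putting $x_0 = x_0' = t$, we get

    $[\{\partial A_\mu(\mathbf x)/\partial x_0\}_t,\ \{\partial A_\nu(\mathbf x')/\partial x_0'\}_t] = 0,$    (33)
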

since $\partial^2\Delta(x)/\partial x_0^2 = 0$ for $x_0 = 0$. We can, as stated on p. 279, take
the quantities $A_{\mu t}(\mathbf x)$ and $\{\partial A_\mu(\mathbf x)/\partial x_0\}_t$ as the dynamical variables
describing the system, and equations (31), (32), and (33) are then the
quantum conditions for these dynamical variables. From the form
of these quantum conditions we see that, apart from numerical
coefficients, the $A_{\mu t}(\mathbf x)$'s can be looked upon as a set of coordinates
q and the $\{\partial A_\mu(\mathbf x)/\partial x_0\}_t$'s as their conjugate momenta p, there being
a $\delta$ function on the right-hand side of (32) instead of a two-suffix $\delta$
symbol on account of the number of these q's and p's being a
continuous infinity. The quantum conditions (31), (32), (33) still hold
if the radiation is in interaction with matter, and indeed in all Lorentz
frames of reference, but the more general condition (30) need not then
hold, since the commutation relations connecting dynamical
variables at different times in general get altered by interaction.

The electric and magnetic fields $\mathcal E$ and $\mathcal H$ form in relativistic
notation a 6-vector $F_{\mu\nu} = -F_{\nu\mu}$,

    $\mathcal E_1 = F_{10}, \quad \mathcal E_2 = F_{20}, \quad \mathcal E_3 = F_{30},$
    $\mathcal H_1 = F_{32}, \quad \mathcal H_2 = F_{13}, \quad \mathcal H_3 = F_{21}.$    (34)

The equations connecting $\mathcal E$ and $\mathcal H$ with the potentials may be
written in tensor form

    $F_{\mu\nu} = \partial A_\nu/\partial x^\mu - \partial A_\mu/\partial x^\nu.$    (35)

The quantum conditions connecting $\mathcal E$ and $\mathcal H$ at different points in
space-time can be obtained immediately from (35) and (30).

76. The Hamiltonian for the field 

The Hamiltonian for the field, $H_R$ say, must be chosen so as to
give the correct Heisenberg equations of motion for the dynamical
variables. This suffices to fix it, except for an arbitrary constant.
From (17), the dynamical variables $A_{k\mu t}$ vary with t or $x_0$ according
to the law

    $dA_{k\mu t}/dt = ik_0A_{k\mu t}.$

Thus from the Heisenberg equations of motion

    $i\hbar\,dA_{k\mu t}/dt = A_{k\mu t}H_R - H_RA_{k\mu t}$

we get

    $-\hbar k_0A_{k\mu t} = A_{k\mu t}H_R - H_RA_{k\mu t}.$    (36)

We must choose $H_R$ to satisfy these conditions.

Let us pass over to discrete k-values and consider again one
particular k-value for which $k_1 = k_2 = 0$, $k_3 = k_0 > 0$. We then
have the commutation relations (20), which show us that, so far as
concerns the degrees of freedom $A_{k1t}$ and $A_{k2t}$, $H_R$ must consist of
the terms

    $4\pi^2(A_{k1t}\bar A_{k1t} + A_{k2t}\bar A_{k2t})\,s_k^{-1},$    (37)

as these terms substituted for $H_R$ in the right-hand side of (36) make
it equal the left-hand side. These terms are in agreement with (72)
of § 63, if one takes into account that the $A_k$'s there differ from the
present ones by the factor $k_0$. For the degrees of freedom $A_{k3t}$ and
$A_{k0t}$ we have, from (22), the commutation relations

    $\bar A_{k3t}A_{k3t} - A_{k3t}\bar A_{k3t} = \hbar k_0 s_k/4\pi^2,$
    $\bar A_{k0t}A_{k0t} - A_{k0t}\bar A_{k0t} = -\hbar k_0 s_k/4\pi^2,$    (38)

which show similarly that $H_R$ contains the terms

    $4\pi^2\{A_{k3t}\bar A_{k3t} - A_{k0t}\bar A_{k0t}\}\,s_k^{-1}.$

It is convenient to change this by a constant and to take instead the
terms

    $4\pi^2(A_{k3t}\bar A_{k3t} - \bar A_{k0t}A_{k0t})\,s_k^{-1},$    (39)

as it will be found later that (39) gives no zero-point energy to $H_R$.
The total Hamiltonian is now

    $H_R = 4\pi^2\sum_k (A_{k1}\bar A_{k1} + A_{k2}\bar A_{k2} + A_{k3}\bar A_{k3} - \bar A_{k0}A_{k0})\,s_k^{-1}$    (40)

    $\quad\ \, = 4\pi^2\int (A_{k1}\bar A_{k1} + A_{k2}\bar A_{k2} + A_{k3}\bar A_{k3} - \bar A_{k0}A_{k0})\,d^3k$    (41)

if we pass back to continuous k-values. This $H_R$ gives, according to
(17) of § 29,

    $e^{iH_Rt/\hbar}A_{k\mu}e^{-iH_Rt/\hbar} = A_{k\mu t} = A_{k\mu}e^{ik_0t}.$    (42)

We may call 'longitudinal degrees of freedom' the degrees of freedom
associated with the variables $A_{k0t}$ and $A_{k3t}$ for the particular
k-value considered above, in contradistinction to the 'transverse
degrees of freedom' associated with the variables $A_{k1t}$ and $A_{k2t}$. For
a general k-value $A_{k3t}$ is to be replaced by $A_{k\kappa t}$, $\kappa$ being a unit
three-dimensional vector in the direction of the three-dimensional part of k.
The longitudinal degrees of freedom do not occur in the theory of
§ 63, $A_{k0}$ and $A_{k\kappa}$ there being zero. The present Hamiltonian (40)
differs from the Hamiltonian (72) of § 63 by the terms referring to
the longitudinal degrees of freedom, these terms being needed now
to make $A_{k0t}$ and $A_{k\kappa t}$ vary correctly with t.

We see from (39) that the contribution of the degree of freedom
$A_{k0t}$ to the Hamiltonian is negative. This means that the dynamical
system formed by the variables $A_{k0t}$, $\bar A_{k0t}$ is a harmonic oscillator of
negative energy. It is rather surprising that such an unphysical idea
as negative energy should appear in the theory in this way. The
negative energy is a necessary consequence of the $-$ sign on the
right-hand side of the second of equations (38) and this $-$ sign is
demanded by relativity. We shall see in the next section that the
negative energy associated with the degree of freedom $A_{k0t}$ is always
compensated by the positive energy associated with the corresponding
longitudinal degree of freedom $A_{k\kappa t}$, so that it never shows up in
practice.

The theory of a harmonic oscillator of negative energy may be
built up in the same way as that of an ordinary harmonic oscillator
given in § 34. Expressing the $A_{k0t}$ of the second of equations (38) in
terms of $\eta$ by means of

    $2\pi\bar A_{k0t} = (\hbar k_0 s_k)^{1/2}\,\eta,$

we have $\eta$ satisfying the same commutation relation with $\bar\eta$ as in
§ 34, and the energy in this degree of freedom is $-\hbar k_0\,\eta\bar\eta$, from (39).
The work of § 34 now shows that the maximum eigenvalue of the
energy is zero, the other eigenvalues being negative integral multiples
of $\hbar k_0$. Introducing the normalized eigenket of the energy belonging
to the eigenvalue zero as the standard ket $|0\rangle$, we have

    $\bar\eta\,|0\rangle = 0$

as in § 34, and $\eta^n|0\rangle$ with n a positive integer is the ket corresponding
to the nth quantum state, which has the energy $-n\hbar k_0$. Any ket can
be expressed as a power series in $\eta$ multiplied into $|0\rangle$.
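
The spectrum just described can be checked with small matrices. The sketch below is an illustration only and is not taken from the text: it assumes the usual matrix representation of the oscillator variables in the basis of normalized states proportional to $\eta^n|0\rangle$, truncated to a finite number of levels, and the names `lowering_matrix`, `transpose`, `matmul` and `negative_oscillator_energies` are hypothetical. The energy operator is taken to be $-\hbar k_0\,\eta\bar\eta$, with $\hbar k_0$ supplied as the parameter `hbar_k0`.

```python
import math


def lowering_matrix(n_levels):
    """Matrix of eta_bar in the basis |0>, |1>, ...: eta_bar|n> = sqrt(n)|n-1>."""
    m = [[0.0] * n_levels for _ in range(n_levels)]
    for n in range(1, n_levels):
        m[n - 1][n] = math.sqrt(n)
    return m


def transpose(a):
    """Transpose of a real matrix; gives eta from eta_bar."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def negative_oscillator_energies(n_levels, hbar_k0=1.0):
    """Diagonal of -hbar_k0 * eta * eta_bar in the truncated basis."""
    eta_bar = lowering_matrix(n_levels)
    eta = transpose(eta_bar)           # eta|n> = sqrt(n+1)|n+1>
    number = matmul(eta, eta_bar)      # eta*eta_bar is diagonal with entries n
    return [-hbar_k0 * number[n][n] for n in range(n_levels)]
```

The product $\eta\bar\eta$ is unaffected by the truncation, so the energies come out as zero for the standard ket and as negative integral multiples of `hbar_k0` for the higher states, up to floating-point rounding.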

For the whole field of radiation we can introduce a standard ket $\rangle_F$
for which there is zero energy in each degree of freedom. Any state
of the field of radiation then corresponds to a ket of the form of a
power series in the various $\eta$-variables multiplied into $\rangle_F$. We can
replace the power series in the $\eta$-variables by a power series in the
Fourier coefficients $A_{k1}$, $A_{k2}$, $A_{k3}$, $\bar A_{k0}$. The different terms in the
power series correspond to different degrees of excitation of the various
Fourier components of the field. Alternatively, they correspond
to different numbers of photons present in the various stationary
states of a photon, there being now longitudinal photons associated
with the longitudinal degrees of freedom, as well as the usual
transverse ones. (The physical significance of the longitudinal photons
will become clear later, see p. 305.) If we are working with continuous
k-values, the power series in $A_{k1}$, $A_{k2}$, $A_{k3}$, $\bar A_{k0}$ becomes a sum of
integrals of degree 0, 1, 2, 3,... in these variables. Any of the linear
operators $\bar A_{k1}$, $\bar A_{k2}$, $\bar A_{k3}$, $A_{k0}$ applied to $\rangle_F$ gives zero.

77 . The supplementary conditions 

We must now go back to the second of the Maxwell equations (13),
which we have ignored so far. We cannot take this equation over
directly into the quantum theory without getting inconsistencies.
The left-hand side of this equation does not commute with $A_\nu(x')$,
according to the quantum conditions (30), so this left-hand side
cannot vanish. The way out of the difficulty was shown by Fermi.†
It consists in adopting a less stringent equation, namely the equation

    $(\partial A_\mu/\partial x_\mu)\,|\rangle = 0,$    (43)

and assuming it to hold for any $|\rangle$ corresponding to a state that can
actually occur in nature. There is one equation (43) for each point
in space-time and these equations must all hold for any ket
corresponding to a state that can actually occur. The ket in (43) does not
depend on t, since we are using the Heisenberg picture, in which each
state corresponds to a fixed ket.

† Fermi, Reviews of Modern Physics, 4 (1932), 125.

We shall call a condition such as (43), which a ket has to satisfy to
correspond to an actual state, a supplementary condition. The
existence of supplementary conditions in the theory does not mean any
departure from or modification in the general principles of quantum
mechanics. The principle of superposition of states and the whole of
the general theory of states, dynamical variables, and observables,
as given in Chapter II, apply also when there are supplementary
conditions, provided we impose a further requirement on a linear
operator in order that it may represent an observable, namely the
requirement that, when it operates on any ket satisfying the
supplementary conditions, it changes this ket into another ket satisfying
the supplementary conditions. We have already had an example of
supplementary conditions in the theory of systems containing several
similar particles. The condition that only symmetrical wave
functions, or only antisymmetrical wave functions, represent states that
can actually occur in nature, is precisely of the same type as condition
(43) and is what we are now calling a supplementary condition. In
this theory the further requirement on linear operators in order that
they shall represent observables is that they shall be symmetrical
between the similar particles.

When we introduce supplementary conditions into our theory we
must verify that they are not too restrictive to allow any ket at all
to satisfy them. If we have more than one supplementary condition,
we can deduce further supplementary conditions from them by taking
P.B.s of the operators in them; thus if we have

    $U\,|\rangle = 0, \qquad V\,|\rangle = 0,$    (44)

we can deduce

    $[U, V]\,|\rangle = 0, \qquad [U, [U, V]]\,|\rangle = 0,$    (45)

and so on. To verify that our supplementary conditions are not too
restrictive, we have to look into all the further supplementary
conditions obtainable by this procedure to see that they can be satisfied,
which we can usually do by showing that after a certain point the
further supplementary conditions are all either identically satisfied
or repetitions of the previous ones.

To apply this procedure to the supplementary conditions (43), we
work out the P.B. of two of the linear operators $\partial A_\mu/\partial x_\mu$, say those
at the points x and x' in space-time. We have from (30)

    $[\partial A_\mu(x)/\partial x_\mu,\ \partial A_\nu(x')/\partial x'_\nu] = g_{\mu\nu}\,\partial^2\Delta(x-x')/\partial x_\mu\partial x'_\nu = \partial^2\Delta(x-x')/\partial x_\mu\partial x'^\mu$
    $= -\Box\Delta(x-x') = 0$

from (11). Thus the conditions (45) are all identically satisfied, so our
supplementary conditions are not too restrictive.
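
The deduction of (45) from (44) can be tried out on finite matrices. The following is a minimal sketch under stated assumptions, not a construction from the text: the operators are small integer matrices chosen for illustration, the commutator is used in place of the P.B. (the two differ only by a numerical factor in the quantum theory), and the names `matmul`, `matvec`, `commutator`, `annihilates`, `U_EXAMPLE`, `V_EXAMPLE` and `KET_EXAMPLE` are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    """Apply the matrix a to the column vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def commutator(a, b):
    """ab - ba, standing in for the P.B. up to a numerical factor."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(ab[0]))]
            for i in range(len(ab))]


def annihilates(op, ket):
    """True if op applied to ket gives the zero vector."""
    return all(component == 0 for component in matvec(op, ket))


# Illustrative operators whose first column vanishes, so that both
# annihilate the ket (1, 0, 0), in analogy with equations (44).
U_EXAMPLE = [[0, 1, 2], [0, 3, 1], [0, 0, 4]]
V_EXAMPLE = [[0, 2, 0], [0, 1, 5], [0, 7, 1]]
KET_EXAMPLE = [1, 0, 0]
```

With these choices `annihilates(commutator(U_EXAMPLE, V_EXAMPLE), KET_EXAMPLE)` holds, and the same is true of the repeated commutator, which is the matrix counterpart of (45).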

We should verify also that the supplementary conditions are
consistent with the equations of motion, in the present case with the first
of equations (13). This consistency is immediately evident in the
quantum theory, as in the classical theory.

Since the second of equations (13) is not valid and has to be replaced
by a supplementary condition, any consequences of this equation in
the ordinary Maxwell theory will not be valid in the quantum theory
and will have to be replaced by supplementary conditions. The
equations

    $\operatorname{div}\mathcal H = 0, \qquad \partial\mathcal H/\partial t = -\operatorname{curl}\mathcal E$    (46)

follow simply from the equations defining $\mathcal E$ and $\mathcal H$ in terms of the
potentials, namely (35), and are therefore valid also in the quantum
theory. The other Maxwell equations for empty space, however,
namely

    $\operatorname{div}\mathcal E = 0, \qquad \partial\mathcal E/\partial t = \operatorname{curl}\mathcal H,$

or

    $\partial F_{\mu\nu}/\partial x_\nu = 0,$

can be derived only with the help of the second of equations (13), as
one sees at once if one substitutes for $F_{\mu\nu}$ its value given by (35), and
are thus not valid in the quantum theory. They must be replaced by

    $\{\operatorname{div}\mathcal E\}\,|\rangle = 0, \qquad \{\partial\mathcal E/\partial t - \operatorname{curl}\mathcal H\}\,|\rangle = 0,$    (47)

holding for any $|\rangle$ corresponding to a state that can actually occur.

The field quantities $\mathcal E$ and $\mathcal H$ at any point in space-time commute
with all the operators in the supplementary conditions, since from
(35) and (30)

    $[F_{\mu\nu}(x),\ \partial A_\lambda(x')/\partial x'_\lambda] = [\partial A_\nu(x)/\partial x^\mu - \partial A_\mu(x)/\partial x^\nu,\ \partial A_\lambda(x')/\partial x'_\lambda]$
    $= g_{\nu\lambda}\,\partial^2\Delta(x-x')/\partial x^\mu\partial x'_\lambda - g_{\mu\lambda}\,\partial^2\Delta(x-x')/\partial x^\nu\partial x'_\lambda$
    $= \partial^2\Delta(x-x')/\partial x^\mu\partial x'^\nu - \partial^2\Delta(x-x')/\partial x^\nu\partial x'^\mu = 0.$

It follows that if $\mathcal E$ or $\mathcal H$ is multiplied into a ket satisfying the
supplementary conditions, it will give another ket satisfying the
supplementary conditions, and hence it fulfils the new requirement
for being an observable. The potentials do not satisfy this
requirement.

By making a Fourier resolution of the left-hand side of equation
(43) we get the equations

    $k^\mu A_{k\mu}\,|\rangle = 0, \qquad k^\mu\bar A_{k\mu}\,|\rangle = 0$    (48)

holding for all values of the 4-vector k with $k^\mu k_\mu = 0$ and $k_0 > 0$.
This is another form for the supplementary conditions. The P.B. of
the operators $k^\mu A_{k\mu}$ and $k^\nu\bar A_{k\nu}$ here, of course, vanishes, as may be
verified directly from (23) or (27).

To examine the consequences of equations (48), let us work with
discrete k-values and consider first one particular k-value for which
$k_1 = k_2 = 0$, $k_3 = k_0 > 0$, as we have done on previous occasions.
For this k-value equations (48) become

    $(A_{k0} - A_{k3})\,|\rangle = 0, \qquad (\bar A_{k0} - \bar A_{k3})\,|\rangle = 0.$    (49)

Multiplying the first of these on the left by $(\bar A_{k0} + \bar A_{k3})$ and the second
by $(A_{k0} + A_{k3})$ and adding, we get

    $(\bar A_{k0}A_{k0} + A_{k0}\bar A_{k0} - \bar A_{k3}A_{k3} - A_{k3}\bar A_{k3})\,|\rangle = 0$

or

    $2(\bar A_{k0}A_{k0} - A_{k3}\bar A_{k3})\,|\rangle = 0$

with the help of (22). This shows that the energy in the two
longitudinal degrees of freedom for this k-value, namely expression (39),
vanishes for any state that occurs in nature. The same result holds
for all k-values. Thus the supplementary conditions ensure that the
negative energy in any $A_{k0t}$ degree of freedom is always exactly cancelled
by the positive energy in the corresponding $A_{k\kappa t}$ degree of freedom.

Let us express the $|\rangle$ in (48) in the form

    $|\rangle = \psi\rangle_F,$

where $\rangle_F$ is the standard ket for the field of radiation introduced in
the preceding section, corresponding to zero energy in each degree of
freedom, and $\psi$ is a power series in the operators $A_{k1}$, $A_{k2}$, $A_{k3}$, $\bar A_{k0}$.
Since $A_{k0}\rangle_F = \bar A_{k3}\rangle_F = 0$, we get from (49), for the k-value to which
these equations refer,

    $(A_{k0}\psi - \psi A_{k0})\rangle_F = A_{k3}\psi\rangle_F, \qquad (\bar A_{k3}\psi - \psi\bar A_{k3})\rangle_F = \bar A_{k0}\psi\rangle_F.$

With the help of the commutation relations (22), these equations
reduce to

    $\dfrac{\hbar k_0 s_k}{4\pi^2}\,\dfrac{\partial\psi}{\partial\bar A_{k0}}\rangle_F = A_{k3}\psi\rangle_F, \qquad \dfrac{\hbar k_0 s_k}{4\pi^2}\,\dfrac{\partial\psi}{\partial A_{k3}}\rangle_F = \bar A_{k0}\psi\rangle_F,$

showing that $\psi$ is of the form

    $\psi = e^{4\pi^2A_{k3}\bar A_{k0}/\hbar k_0 s_k}\,\psi',$

where $\psi'$ is independent of $\bar A_{k0}$ and $A_{k3}$. Applying this argument to
all k-values, we find that $\psi$ is of the form

    $\psi = e^{4\pi^2\sum_k A_{k\kappa}\bar A_{k0}/\hbar k_0 s_k}\,\chi,$    (50)

where $\chi$ involves only the transverse components of $A_k$. In terms of
continuous k-values

    $\psi = e^{4\pi^2\int A_{k\kappa}\bar A_{k0}/\hbar k_0\,\cdot\,d^3k}\,\chi.$    (51)

We see in this way that the supplementary conditions fix the form
of the wave function $\psi$ so far as concerns the longitudinal degrees of
freedom. Thus the longitudinal degrees of freedom cannot play any
important role in the dynamical theory. This corresponds to their
not being of physical importance. Their only purpose is to give the
theory a relativistic setting. The important part of $\psi$ is the factor $\chi$
referring to the transverse degrees of freedom. This factor is the
same as the wave function in the theory of a field of radiation without
interaction with matter given on pp. 240-2.


78. Classical electrodynamics in Hamiltonian form 
The foregoing theory must now be extended to take into account
the interaction of the field of radiation with matter. This involves
a dynamical system of charged particles interacting with the
electromagnetic field. Let us first consider this dynamical system
classically and see how to put its equations of motion into Hamiltonian
form. We shall then have a basis from which to build up a quantum
theory by analogy.

Each of the charged particles will describe a world-line in space-time
in the classical theory. We give the particles labels $i$ and
denote the coordinates of a point on the world-line of the $i$th particle
by $z_{\mu i}$. These coordinates are functions of the proper-time $s_i$ of the
$i$th particle, this proper-time being defined so that its difference for
two neighbouring points on the world-line satisfies

    $ds_i^2 = (dz_i,\ dz_i), \qquad dz_{0i}/ds_i > 0.$    (52)

The velocity 4-vector $v_i$ of the $i$th particle is defined by

    $v_i = dz_i/ds_i$    (53)

and satisfies from (52)

    $(v_i,\ v_i) = 1, \qquad v_{0i} > 0.$    (54)

The presence of charges changes the Maxwell equations (13) to

    $\Box\mathcal A_\mu = 4\pi j_\mu, \qquad \partial\mathcal A_\mu/\partial x_\mu = 0,$    (55)

where $j_\mu$ is the 4-vector whose time component is the charge-density
and whose space components are the current density. For
mathematical simplicity we suppose the charge on each particle to be
concentrated at one point. Then $j_\mu$ vanishes everywhere except on the
world-lines of the particles, where it has singularities which can be
described in terms of $\delta$ functions. The solution of (55) can be written
in the form

    $\mathcal A_\mu = \mathcal A_{\mu,\mathrm{in}} + \sum_i \mathcal A_{\mu i,\mathrm{ret}},$    (56)

where $\mathcal A_{\mu,\mathrm{in}}$ are the potentials of the incoming field of radiation which
acts on the particles and $\mathcal A_{\mu i,\mathrm{ret}}$ are the retarded potentials of the $i$th
particle, the summation in (56) being over all the particles. The
potentials $\mathcal A_{\mu,\mathrm{in}}$ satisfy the equations for no charges, equations (13),
and the $\mathcal A_{\mu i,\mathrm{ret}}$ are given by

    $\mathcal A_{\mu i,\mathrm{ret}} = e_iv_{\mu i}/(v_i,\ x - z_i),$    (57)

$e_i$ being the charge of the $i$th particle, and the variables $v_i$, $z_i$ in (57)
being taken at the retarded proper-time $s_i$ of the $i$th particle, for which

    $(x - z_i,\ x - z_i) = 0, \qquad x_0 - z_{0i} > 0.$    (58)
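
The conditions (58) and the formula (57) can be worked through for the simplest world-line. The sketch below is an illustration under assumptions that are not in the text: the particle is taken to move uniformly, $z(s) = z_{\mathrm{start}} + vs$ with $(v, v) = 1$, so that (58) becomes a quadratic equation in $s$; the components of `v` are used directly as the $v_\mu$ of (57); and the names `minkowski`, `retarded_proper_time` and `retarded_potential` are hypothetical.

```python
import math


def minkowski(a, b):
    """Scalar product (a, b) = a0*b0 - a1*b1 - a2*b2 - a3*b3 (assumed signature)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def retarded_proper_time(x, z_start, v):
    """Solve (x - z(s), x - z(s)) = 0 with x0 - z0(s) > 0 for z(s) = z_start + v*s.

    With d = x - z_start the condition reads s**2 - 2*s*(v, d) + (d, d) = 0,
    and the smaller root is the retarded one.
    """
    d = [xi - zi for xi, zi in zip(x, z_start)]
    vd = minkowski(v, d)
    disc = vd * vd - minkowski(d, d)   # non-negative for a time-like unit v
    return vd - math.sqrt(max(disc, 0.0))


def retarded_potential(x, z_start, v, charge):
    """Components e*v/(v, x - z) of (57), evaluated at the retarded proper-time."""
    s = retarded_proper_time(x, z_start, v)
    z = [zi + vi * s for zi, vi in zip(z_start, v)]
    denom = minkowski(v, [xi - zi for xi, zi in zip(x, z)])
    return [charge * vi / denom for vi in v]
```

For a particle at rest at the origin, v = (1, 0, 0, 0), and a field point at time t and distance r, the sketch gives the retarded proper-time t - r and a time component e/r, which is the Coulomb potential.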

As the equations of motion for the $i$th particle, we shall take
Lorentz's equations

    $m_i\,dv_{\mu i}/ds_i = e_iv^\nu_i\,\{F_{\mu\nu,\mathrm{in}} + \sum_{j\neq i}F_{\mu\nu j,\mathrm{ret}} + \tfrac12(F_{\mu\nu i,\mathrm{ret}} - F_{\mu\nu i,\mathrm{adv}})\},$    (59)

$m_i$ being the mass of the $i$th particle, $F_{\mu\nu,\mathrm{in}}$ and $F_{\mu\nu j,\mathrm{ret}}$ being the fields
derived from the potentials $\mathcal A_{\mu,\mathrm{in}}$ and $\mathcal A_{\mu j,\mathrm{ret}}$ in accordance with (35),
and $F_{\mu\nu i,\mathrm{adv}}$ being similarly the field derived from the advanced
potentials $\mathcal A_{\mu i,\mathrm{adv}}$ given by (57) and (58) with the inequality in (58)
reversed. The field functions on the right-hand side of (59) are all
to be taken at the point $x = z_i$ where the $i$th particle is situated.
The summation in (59) is over all the particles except the $i$th and
shows that all the other particles act on the $i$th through their retarded
fields. The fields $F_{\mu\nu i,\mathrm{ret}}$ and $F_{\mu\nu i,\mathrm{adv}}$ are infinitely great at the point
$x = z_i$, but their difference is finite, and this difference occurring in
(59) gives the effect of radiation damping on the motion of the
particle.†

† For a derivation of Lorentz's equations in the form (59) and a discussion of their
validity and consequences, see Dirac, Proc. Roy. Soc. A, 167 (1938), 148.

Our problem now is to put the equations of motion (59) into the
Hamiltonian form. Let us first discuss in general terms what we
should expect the Hamiltonian form to be like in a relativistic theory.
We should not keep precisely to the form (14) or (15) of § 28, since
this puts the time on a different footing from the space coordinates.
We should expect to have the proper-time appearing as independent
variable, and since each particle has its own proper-time we must
then have several independent variables. Each dynamical variable $\xi$
is thus in general a function of the proper-times $s_i$ of all the particles
and has a value only with respect to a particular point on the
world-line of each particle. The general concept of a P.B. satisfying the
laws (2) to (6) of § 21 can be retained in a relativistic theory. We shall
need one Hamiltonian for each particle, the relativistic Hamiltonian
$G_i$ of the $i$th particle determining how dynamical variables vary with
the independent variable $s_i$, according to the equation

    $d\xi/ds_i = [\xi, G_i].$    (60)

In order that the various equations (60) for different $i$ shall be
consistent they must make

    $d^2\xi/ds_i\,ds_j = d^2\xi/ds_j\,ds_i,$

which requires that

    $[[\xi, G_i], G_j] = [[\xi, G_j], G_i]$

or

    $[[G_i, G_j], \xi] = 0,$    (61)

from (6) of § 21. This must hold for any dynamical variable $\xi$, so we
must have

    $[G_i, G_j] = \text{a number}.$    (62)

Equations (60) and (62) give the general Hamiltonian form of the
equations of motion in a relativistic theory of several particles.

Let us consider the dynamical variables for our system of several
charged particles interacting with the electromagnetic field. The four
coordinates $z_{\mu i}$ of the $i$th particle will provide four dynamical
variables, the time coordinate being treated on the same footing as the
three space coordinates. The four components $p_{\mu i}$ of the
momentum-energy 4-vector of the $i$th particle will provide four more. As the
obvious generalization of the P.B. relations between coordinates and
momenta in non-relativistic dynamics, we assume

    $[z_{\mu i}, z_{\nu j}] = 0, \qquad [p_{\mu i}, p_{\nu j}] = 0, \qquad [z_{\mu i}, p_{\nu j}] = -g_{\mu\nu}\,\delta_{ij}.$    (63)

The variables $z_{\mu i}$, $p_{\mu i}$ should depend only on the proper-time $s_i$ and
should be independent of the proper-times $s_j$ ($j \neq i$) of the other
particles, so from (60) we must have

    $[z_{\mu i}, G_j] = 0, \qquad [p_{\mu i}, G_j] = 0 \qquad (j \neq i).$    (64)

We need also dynamical variables to describe the field. We take
these to be the potentials $A_\mu(x)$ at all points in space-time. The
4-vector x here should be looked upon as a parameter labelling these
dynamical variables, there being four of them for each x. Each of
these dynamical variables $A_\mu(x)$ is a function of the proper-times $s_i$.
Thus all the $A_\mu(x)$ variables together provide a set of potentials
throughout space-time depending on a point on the world-line of each
particle. These potentials are therefore not the same as the Maxwell
potentials $\mathcal A_\mu(x)$ satisfying (55). We shall call them the Wentzel
potentials.† They are closely related to the Maxwell potentials, as
will appear later.

Since a particle variable and a field variable refer to different
degrees of freedom, their P.B. must be zero, i.e.

    $[z_{\mu i}, A_\nu(x)] = 0, \qquad [p_{\mu i}, A_\nu(x)] = 0.$    (65)

We need also the P.B. of two field variables. A value for this P.B.
is provided by the theory of radiation without interaction with
matter, namely by equation (30) considered classically. This equation
as it stands, however, is not a satisfactory one to use when there are
charged particles present, as it causes certain infinite terms to appear
in the equations of motion of the particles. One must replace it by

    $[A_\mu(x), A_\nu(x')] = \tfrac12 g_{\mu\nu}\{\Delta(x-x'+\lambda) + \Delta(x-x'-\lambda)\},$    (66)

where $\lambda$ is a small 4-vector lying within the light-cone, i.e.

    $(\lambda, \lambda) > 0,$    (67)

and is ultimately to be made to tend to zero. One must not make
$\lambda \to 0$ too early or one will get infinite terms appearing in the
equations. With finite $\lambda$ the theory is not relativistic, as the direction of $\lambda$
provides a preferred direction in space-time, but it will be found that
as $\lambda \to 0$ the equations of motion become independent of the direction
of $\lambda$, so long as (67) is satisfied, so that in the limit the theory is
relativistic. Equations (63), (65), and (66) give the P.B.s of all our
dynamical variables.

† These potentials were first used to give Lorentz's equations of motion by Wentzel,
Z. f. Physik, 86 (1933), 479.

We must now set up the Hamiltonians. We shall assume that

    $G_i = \dfrac{1}{2m_i}\{m_i^2 - (p_i - e_iA(z_i),\ p_i - e_iA(z_i))\}$    (68)

and shall verify that these Hamiltonians lead to the correct equations of
motion. Let us first test for consistency. We find from (63), (65), and
(66) that

    $[G_i, G_j] = 0$    (69)

provided the conditions

    $(z_i - z_j \pm \lambda,\ z_i - z_j \pm \lambda) < 0 \qquad (i \neq j)$    (70)

are fulfilled. These conditions mean that the independent variables
$s_i$ are not completely independent, but must be restricted so that the
points which they specify on the world-lines of the various particles
each lie outside the light-cones with the others as vertices (and remain
so when shifted by the amount $\pm\lambda$). Subject to these conditions the
equations of motion are consistent. The dynamical variables should
now be considered as undefined for values of the $s_i$ which do not
fulfil (70).
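
The restriction (70) on the points chosen on the world-lines is easy to test for given numerical values. The following is a minimal sketch, not part of the original text: the names `minkowski` and `satisfies_condition_70` are hypothetical, the points are given as 4-tuples with the time component first, and the signature (+, -, -, -) is assumed. Only unordered pairs need be examined, since interchanging $i$ and $j$ merely changes the sign of the 4-vector whose length is tested.

```python
from itertools import combinations


def minkowski(a, b):
    """Scalar product (a, b) = a0*b0 - a1*b1 - a2*b2 - a3*b3 (assumed signature)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def satisfies_condition_70(points, lam):
    """Check (z_i - z_j +/- lam, z_i - z_j +/- lam) < 0 for every pair i != j."""
    for z_i, z_j in combinations(points, 2):
        for sign in (1, -1):
            d = [a - b + sign * l for a, b, l in zip(z_i, z_j, lam)]
            if minkowski(d, d) >= 0:
                return False
    return True
```

Two points at equal times and unit distance apart satisfy the condition for a small time-like `lam`, whereas two points that are space-like separated but lie closer to the light-cone than `lam` do not, which is the effect of the shift by $\pm\lambda$ mentioned above.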

Let us consider now the equations of motion. We see at once that
equations (64) are satisfied. Putting $\xi = z_{\mu i}$ in (60), we get

    $m_iv_{\mu i} = p_{\mu i} - e_iA_\mu(z_i),$    (71)

which is the usual relation between velocity and momentum for a
charged particle. From (54) and (68) we see now that

    $G_i = 0.$    (72)

Equations (69) show that the $G_i$ are all constants of the motion and
(72) shows that we must take these constants to be zero to get the
equations of motion that we want. Putting $\xi = p_{\mu i}$ in (60), we get

    $dp_{\mu i}/ds_i = e_iv^\nu_i\,\{\partial A_\nu(x)/\partial x^\mu\}_{x=z_i},$

which reduces, with the help of (71), to

    $m_i\,dv_{\mu i}/ds_i = e_iv^\nu_i\,\{\partial A_\nu(x)/\partial x^\mu - \partial A_\mu(x)/\partial x^\nu\}_{x=z_i}.$    (73)

This would be the same as Lorentz's equation (59) if we could arrange
to have

    $A_\mu(x) = \mathcal A_{\mu,\mathrm{in}}(x) + \sum_{j\neq i}\mathcal A_{\mu j,\mathrm{ret}}(x) + \tfrac12\{\mathcal A_{\mu i,\mathrm{ret}}(x) - \mathcal A_{\mu i,\mathrm{adv}}(x)\}$    (74)

for x in the neighbourhood of $z_i$. Finally, putting $\xi = A_\mu(x)$ in (60),
we get

    $dA_\mu(x)/ds_i = (e_i/m_i)\{p^\nu_i - e_iA^\nu(z_i)\}\,[A_\mu(x), A_\nu(z_i)]$
    $= \tfrac12 e_iv_{\mu i}\{\Delta(x-z_i+\lambda) + \Delta(x-z_i-\lambda)\}.$    (75)

These equations for all $i$ can be integrated to give

    $A_\mu(x) = \sum_i \tfrac12 e_i\int_{-\infty}^{s_i} v'_{\mu i}\{\Delta(x-z'_i+\lambda) + \Delta(x-z'_i-\lambda)\}\,ds'_i + a_\mu(x),$    (76)

where $v'_i$, $z'_i$ are short for $v_i(s'_i)$, $z_i(s'_i)$, and $a_\mu(x)$ is a constant of the
motion for each x and $\mu$, i.e. it is independent of the $s_i$. Equation (76)
shows the form of the Wentzel potentials $A_\mu(x)$ as functions of the $s_i$.
These equations, it should be remembered, hold only for values of the
$s_i$ satisfying (70); for other values of the $s_i$ the Wentzel potentials are
undefined.

In order to see the significance of (76), let us study the integral

    $\int_{-\infty}^{s_i} v'_{\mu i}\,\Delta(x-z'_i)\,ds'_i.$    (77)

If the point x lies inside the future part of the light-cone of $z_i$ at the
proper-time $s_i$, i.e. if

    $(x-z_i,\ x-z_i) > 0, \qquad x_0 - z_{0i} > 0,$    (78)

then (77) vanishes, since the $\Delta$ function vanishes throughout the
domain of integration. If the point x lies outside the light-cone of
$z_i$, i.e. if

    $(x-z_i,\ x-z_i) < 0,$    (79)

there is just one value of $s'_i$ in the domain of integration for which
the $\Delta$ function does not vanish, namely the retarded proper-time for
the field point x. The integral (77) is then equal to, with the help
of (9),

where $\rho$ is a positive number. The integral now becomes

/ rcfeg «-* «*-< *-*> - 

—p 

taken at the retarded proper-time. Thus from (57)

    $e_i\int_{-\infty}^{s_i} v'_{\mu i}\,\Delta(x-z'_i)\,ds'_i = \mathcal A_{\mu i,\mathrm{ret}}(x).$    (80)

If the point x lies inside the past part of the light-cone of $z_i$, i.e. if

    $(x-z_i,\ x-z_i) > 0, \qquad x_0 - z_{0i} < 0,$    (81)

there are two values of $s'_i$ for which the $\Delta$ function does not vanish,
namely the retarded and advanced proper-times. The contribution
of the retarded proper-time to the integral (77) is the same as in the
preceding case; the contribution of the advanced proper-time may
be worked out by the same method and is, when multiplied by $e_i$,
$-\mathcal A_{\mu i,\mathrm{adv}}(x)$. Summing up our results, we have

    $e_i\int_{-\infty}^{s_i} v'_{\mu i}\,\Delta(x-z'_i)\,ds'_i$
    $\quad = 0$  when (78) holds,
    $\quad = \mathcal A_{\mu i,\mathrm{ret}}(x)$  when (79) holds,
    $\quad = \mathcal A_{\mu i,\mathrm{ret}}(x) - \mathcal A_{\mu i,\mathrm{adv}}(x)$  when (81) holds.    (82)


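Substituting the results (82) with $x \pm \lambda$ for x into (76) we find, for
x very close to $z_i$ (close compared to $\lambda$), taking into account (70) and
(67) and taking $\lambda_0 > 0$,

    $A_\mu(x) = \sum_{j\neq i}\tfrac12\{\mathcal A_{\mu j,\mathrm{ret}}(x+\lambda) + \mathcal A_{\mu j,\mathrm{ret}}(x-\lambda)\}$
    $\qquad\quad + \tfrac12\{\mathcal A_{\mu i,\mathrm{ret}}(x-\lambda) - \mathcal A_{\mu i,\mathrm{adv}}(x-\lambda)\} + a_\mu(x).$

If we take

    $a_\mu(x) = \mathcal A_{\mu,\mathrm{in}}(x),$    (83)

this agrees with (74) in the limit $\lambda = 0$. Thus the choice (83) for the
constants of the motion $a_\mu(x)$, a choice which is permissible since
neither side of the equation depends on the $s_i$, results in the equations
of motion for all the particles becoming the Lorentz equations in the limit
$\lambda = 0$.

The ingoing potentials $\mathcal A_{\mu,\mathrm{in}}$ must satisfy the equations (13) but are
otherwise undetermined. Thus the constants of the motion $a_\mu(x)$
must satisfy

    $\Box a_\mu(x) = 0, \qquad \partial a_\mu(x)/\partial x_\mu = 0$    (84)

but are otherwise arbitrary. Inserting these conditions in (76) we
find, with the help of (11),

    $\Box A_\mu(x) = 0,$    (85)

    $\partial A_\mu(x)/\partial x_\mu = \sum_i \tfrac12 e_i\int_{-\infty}^{s_i} v'_{\mu i}\,\dfrac{\partial}{\partial x_\mu}\{\Delta(x-z'_i+\lambda) + \Delta(x-z'_i-\lambda)\}\,ds'_i$
    $= -\sum_i \tfrac12 e_i\int_{-\infty}^{s_i} v'_{\mu i}\,\dfrac{\partial}{\partial z'_{\mu i}}\{\Delta(x-z'_i+\lambda) + \Delta(x-z'_i-\lambda)\}\,ds'_i$
    $= -\sum_i \tfrac12 e_i\int_{-\infty}^{s_i} \dfrac{d}{ds'_i}\{\Delta(x-z'_i+\lambda) + \Delta(x-z'_i-\lambda)\}\,ds'_i$
    $= -\sum_i \tfrac12 e_i\{\Delta(x-z_i+\lambda) + \Delta(x-z_i-\lambda)\}.$    (86)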

The work of this section can be summed up as follows. To describe
a number of charged particles interacting with the electromagnetic
field we need the dynamical variables $z_{\mu i}$, $p_{\mu i}$, $A_\mu(x)$ satisfying the
P.B. relations (63), (65), (66). The equations of motion then take the
Hamiltonian form (60) with the Hamiltonians $G_i$ given by (68),
provided one imposes certain conditions on some of the constants of the
motion, namely the $G_i$'s must vanish and equations (85) and (86)
must hold.

The equations (85) and (86) for the Wentzel potentials should
be compared with the equations (55) for the Maxwell potentials.
Of the two equations (13) satisfied by the electromagnetic potentials
in the absence of charges, the first gets modified by the presence of
charges in the case of the Maxwell potentials and the second in the
case of the Wentzel potentials. For a field point x lying outside the
light-cone of all the electron points $z_i$, each of the integrals in (76)
is given by (80) and the right-hand side of (76) becomes equal to the
right-hand side of (56) in the limit $\lambda = 0$. Thus for this domain of x
the Wentzel and the Maxwell potentials are equal.

79. Passage to the quantum theory

Let us now construct a quantum theory analogous to the classical
theory of the preceding section. We use the same dynamical variables
as before, namely the particle coordinates $z_{\mu i}$ and momenta $p_{\mu i}$ and
the Wentzel potentials $A_\mu(x)$, and assume them to satisfy quantum
conditions corresponding to their having the same P.B.s as in the
classical theory, given by (63), (65), and (66). The classical
Hamiltonians (68) should be replaced by Hamiltonians of the form given
in the preceding chapter, applying to particles with a spin in order
to get satisfactory relativistic wave equations. Thus we must
introduce further dynamical variables to describe the spins. For the $i$th
particle we need the spin variables $\alpha_{ri}$ (r = 1, 2, 3) and $\alpha_{mi}$, which
anticommute with one another and have their squares equal to unity,
and which commute with all the $z_{\mu j}$, $p_{\mu j}$, and $A_\mu(x)$ variables, and
also with the spin variables of the other particles. We can then set
up Hamiltonians of the form of the operator in (9) and (10) of § 67,

    $G_i = p_{0i} - e_iA_0(z_i) + (\boldsymbol\alpha_i,\ \mathbf p_i - e_i\mathbf A_i) + \alpha_{mi}m_i,$    (87)

to replace the classical Hamiltonians (68), $\mathbf A_i$ being written instead
of $\mathbf A(z_i)$ in the three-dimensional scalar product.

We describe a state of motion of the whole system of particles and
field by a wave function in the coordinates and times of the
particles, which wave function is a ket in the other degrees of freedom,
i.e. those of the field and of the spins of the particles. Following the
notation of the end of § 20, we write this wave-function-ket as
$|z\rangle$. It must satisfy the wave equations

$$G_i\,|z\rangle = 0, \tag{88}$$

which may be looked upon as supplementary conditions corresponding
to the classical equations (72). For the various equations (88) to
be consistent we need, by an application of (45),

$$[G_i, G_j]\,|z\rangle = 0, \tag{89}$$

a rather more stringent condition than the classical consistency
condition (62). With the Hamiltonians (87), $[G_i, G_j] = 0$ when (70)
holds and the condition (89) is then satisfied. The conditions (70) can
be brought in by supposing that $|z\rangle$ is defined only for values of
the z-variables satisfying (70), so that it is only in this domain of
definition of $|z\rangle$ that equations (89) have to hold. The wave
equations (88) are consistent in this domain.

The remaining equations of the classical theory, equations (85) and
(86), must now be taken over into the quantum theory. Equation
(85) may be assumed to hold unchanged in the quantum theory, as
it does not give rise to any inconsistency because its left-hand side
commutes with all the dynamical variables. Equation (86) must be
replaced by a supplementary condition, as otherwise it would lead
to inconsistencies. Defining R(x) by

$$R(x) = \partial A_\mu(x)/\partial x_\mu + \tfrac{1}{2}\sum_i e_i\{\Delta(x - z_i + \lambda) + \Delta(x - z_i - \lambda)\}, \tag{90}$$

we take the supplementary condition

$$R(x)\,|z\rangle = 0 \tag{91}$$

holding for all x as the quantum analogue of the classical equation
(86). It is a generalization of the supplementary condition (43) for
no charges. We have, using (66),

$$\begin{aligned}
[R(x),\ p_{\mu i} - e_i A_\mu(z_i)] &= -\partial R(x)/\partial z_i^\mu - e_i\,[R(x), A_\mu(z_i)] \\
&= -\tfrac{1}{2} e_i \frac{\partial}{\partial z_i^\mu}\{\Delta(x - z_i + \lambda) + \Delta(x - z_i - \lambda)\} \\
&\quad - \tfrac{1}{2} e_i \frac{\partial}{\partial x^\mu}\{\Delta(x - z_i + \lambda) + \Delta(x - z_i - \lambda)\} \\
&= 0,
\end{aligned} \tag{92}$$

so that from (68)

$$[R(x), G_i] = 0. \tag{93}$$

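Thus the supplementary conditions (88) and (91) are consistent.
Again

$$\begin{aligned}
[R(x), R(x')] &= \left[\frac{\partial A_\mu(x)}{\partial x_\mu},\ \frac{\partial A_\nu(x')}{\partial x'_\nu}\right] \\
&= \tfrac{1}{2}\frac{\partial^2}{\partial x_\mu\,\partial x'^\mu}\{\Delta(x - x' + \lambda) + \Delta(x - x' - \lambda)\} \\
&= -\tfrac{1}{2}\,\Box\,\{\Delta(x - x' + \lambda) + \Delta(x - x' - \lambda)\} = 0
\end{aligned} \tag{94}$$

from (11), so that the various supplementary conditions (91) obtained
by putting different values for x are consistent with one another.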

We now have the complete scheme of quantum equations corresponding
to the classical theory of the preceding section, namely the
P.B. relations (63), (65), and (66) together with the equations (85),
(88), and (91), and have verified that they are all consistent for the
domain of the z's for which (70) holds. If some of the particles are
of the same kind and are bosons or fermions, the further conditions
must be imposed that $|z\rangle$ is symmetrical or antisymmetrical, as
the case may be, between the coordinates (and spin variables) of the
similar particles.

The wave-function-ket $|z\rangle$, if normalized, has the physical
interpretation that $\langle z|z\rangle$ is the probability, per unit
three-dimensional volume for each particle, of each particle being in
the neighbourhood of the place fixed by its coordinates
$z_{1i}, z_{2i}, z_{3i}$ at the time $z_{0i}$. The theory allows one to
calculate this probability, for any state of motion of the system, only
provided the conditions (70) are satisfied, which means, in the limit
$\lambda = 0$, that the points in space-time must each be outside the
light-cones of the others. The observations of whether the particles
are at the places $z_{1i}, z_{2i}, z_{3i}$ at the times $z_{0i}$ are
thus compatible observations only provided the points $z_i$ in
space-time are outside each other’s light-cones. This result of the
theory is to be expected on general physical grounds, since the
observation of whether a particle is at a particular place at a
particular time may be expected to produce a disturbance throughout
that region of space-time lying inside the future light-cone of the
particular place and time.
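
The compatibility condition in the limit $\lambda = 0$ can be illustrated with a minimal sketch. It assumes units with c = 1 and the metric signature (+, -, -, -); the function names `interval` and `mutually_spacelike` are illustrative only and do not come from the text.

```python
from itertools import combinations

def interval(a, b):
    """Minkowski interval (a-b, a-b), signature (+,-,-,-), units with c = 1."""
    d = [ai - bi for ai, bi in zip(a, b)]
    return d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2

def mutually_spacelike(points):
    """True when every point lies outside the light-cones of all the others."""
    return all(interval(p, q) < 0 for p, q in combinations(points, 2))

# Each point is (z0, z1, z2, z3); the numbers are arbitrary test values.
compatible = mutually_spacelike([(0.0, 0, 0, 0), (0.5, 1, 0, 0), (0.2, 0, 2, 0)])
incompatible = mutually_spacelike([(0.0, 0, 0, 0), (2.0, 1, 0, 0)])
```

In the first set all separations are space-like, so the position observations would be compatible in the sense described above; in the second set the later point lies inside the future light-cone of the earlier one.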

Equation (85) enables us to resolve the potentials into their Fourier
components according to

$$A_\mu(x) = \int \{A_{k\mu}\, e^{i(kx)} + \bar{A}_{k\mu}\, e^{-i(kx)}\}\, k_0^{-1}\, d^3k \tag{95}$$

with

$$k_0 = |\mathbf{k}|, \tag{96}$$

as in the case of no charges. The Fourier coefficients $A_{k\mu}$ no
longer satisfy the commutation relations (27) on account of the
occurrence of $\lambda$ in (66). They still satisfy (28) and instead of
(27) they satisfy

$$\bar{A}_{k\mu} A_{k'\nu} - A_{k'\nu}\bar{A}_{k\mu} = -g_{\mu\nu}/4\pi^2 \cdot \hbar k_0 \cos(k\lambda)\, \delta_3(\mathbf{k} - \mathbf{k}'), \tag{97}$$

as may be verified by noting that (28) and (97) lead to equation
(29) with the extra factor $\cos(k\lambda)$ in the integrand and this
extra factor makes equation (29) lead to (66) instead of (30).
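
The origin of the extra factor can be checked numerically. The sketch below assumes, as the form of (94) suggests, that $\lambda$ enters only through the symmetric pair of shifts $x \pm \lambda$; averaging a plane wave over these two shifts then multiplies it by $\cos(k\lambda)$. The helper names and the sample numbers are illustrative and not from the text.

```python
import cmath
import math

def mdot(a, b):
    """Four-dimensional scalar product (ab), signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def averaged_wave(k, x, lam):
    """Mean of exp(i(k, x+lam)) and exp(i(k, x-lam))."""
    x_plus = [xi + li for xi, li in zip(x, lam)]
    x_minus = [xi - li for xi, li in zip(x, lam)]
    return 0.5 * (cmath.exp(1j * mdot(k, x_plus)) + cmath.exp(1j * mdot(k, x_minus)))

k3 = (0.3, -1.2, 0.7)                              # arbitrary spatial wave vector
k = (math.sqrt(sum(c * c for c in k3)),) + k3      # k0 = |k|, so (kk) = 0
x = (0.4, 1.0, -2.0, 0.5)                          # arbitrary field point
lam = (0.05, 0.01, 0.0, -0.02)                     # small time-like 4-vector

lhs = averaged_wave(k, x, lam)
rhs = math.cos(mdot(k, lam)) * cmath.exp(1j * mdot(k, x))
```

The two values `lhs` and `rhs` agree to rounding error, which is the elementary identity behind the factor $\cos(k\lambda)$ in (97).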

It is convenient to redefine $A_{k\mu}$ for those values of k for which
$\cos(k\lambda)$ is negative so that

$$\text{new } A_{k\mu} = -\,\text{old } \bar{A}_{-k\mu}.$$

Thus the new Fourier coefficient $A_{k\mu}$ exists when
$k_0\cos(k\lambda) > 0$. With $\lambda$ very small, the redefinition
affects only Fourier coefficients with very large k-values. With the new
$A_{k\mu}$ equation (95) still holds if (96) is replaced by†

$$k_0 = |\mathbf{k}|\,|\cos(k\lambda)|/\cos(k\lambda) \tag{98}$$

† If $\lambda$ does not lie along the time axis there are some regions of
$(k_1, k_2, k_3)$-space for which there is no $k_0$ satisfying (98) and
others for which there are two. The integral (95), and similar integrals
in the future, are then to be understood as taken over the domain of
$(k_1, k_2, k_3)$-space for which (98) has a solution and as summed over
both values of the integrand for that part of the domain for which (98)
has two solutions. From the four-dimensional point of view, the domain
of integration is that part of the light-cone (kk) = 0 for which
$k_0\cos(k\lambda) > 0$, and is Lorentz invariant for a given $\lambda$.

and (97) holds unchanged. The right-hand side of (97) with k = k'
is now always positive for $\mu = \nu = 1$, 2, or 3 and negative for
$\mu = \nu = 0$. This enables us to express any ket in the degrees of
freedom of the field as a power series in the variables
$A_{k1}, A_{k2}, A_{k3}$, and $\bar{A}_{k0}$, multiplied into the
standard ket $\rangle_F$ corresponding to no energy in each of the
degrees of freedom, as we had at the end of § 76. Expressing the
wave-function-ket $|z\rangle$ in this way, we have

$$|z\rangle = \psi\rangle_F \tag{99}$$

where $\psi$ is a power series in the variables
$A_{k1}, A_{k2}, A_{k3}, \bar{A}_{k0}$, whose coefficients are each a
wave function in the z-variables and a ket in the spin degrees of
freedom. These coefficients correspond to there being different numbers
of photons in the various degrees of freedom of the field.

80. Elimination of the longitudinal waves 
The electromagnetic field in the foregoing electrodynamical theory,
both classical and quantum, involves longitudinal waves as well as
transverse ones. The potentials $A_\mu(x)$ may be expressed as

$$A_\mu(x) = L_\mu(x) + M_\mu(x), \tag{100}$$

where $L_\mu(x)$ are the potentials of the longitudinal waves and
$M_\mu(x)$ those of the transverse waves. The longitudinal waves are
made up of the component $A_{k0}$ and the longitudinal component of the
Fourier component $A_{k\mu}$, as discussed in § 76. Here the
longitudinal component is the component of the three-dimensional
vector $A_{kr}$ (r = 1, 2, 3) in the direction of the three-dimensional
vector $k_r$, so that, expressed as a three-dimensional vector, it
equals $(\mathbf{k}\mathbf{A}_k)\,k_r k_0^{-2}$. Thus

$$\begin{aligned}
L_0(x) &= A_0(x), \\
L_r(x) &= \int \{(\mathbf{k}\mathbf{A}_k)\, e^{i(kx)} + (\mathbf{k}\bar{\mathbf{A}}_k)\, e^{-i(kx)}\}\, k_r k_0^{-3}\, d^3k.
\end{aligned} \tag{101}$$

These equations fix the longitudinal part of the potentials, and the
transverse part is then fixed by (100), i.e.

$$M_0(x) = 0, \qquad M_r(x) = A_r(x) - L_r(x). \tag{102}$$
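
The splitting of a single Fourier amplitude into the part along the wave vector and the remainder can be sketched as follows. The function name and the sample numbers are illustrative, not from the text; the projection formula is the one quoted above, the component along k being $(\mathbf{k}\mathbf{A}_k)k_r$ divided by the squared length of k, which equals $k_0^2$ on the light-cone.

```python
def split_longitudinal_transverse(k, a):
    """Split a 3-vector amplitude a into parts along and perpendicular to k."""
    k_squared = sum(kc * kc for kc in k)            # equals k0**2 when (kk) = 0
    k_dot_a = sum(kc * ac for kc, ac in zip(k, a))  # three-dimensional scalar product
    longitudinal = tuple(k_dot_a * kc / k_squared for kc in k)
    transverse = tuple(ac - lc for ac, lc in zip(a, longitudinal))
    return longitudinal, transverse

k_vec = (1.0, 2.0, -2.0)     # arbitrary wave vector
a_vec = (0.5, -1.0, 3.0)     # arbitrary amplitude
long_part, trans_part = split_longitudinal_transverse(k_vec, a_vec)
k_dot_trans = sum(kc * tc for kc, tc in zip(k_vec, trans_part))
```

The transverse part is orthogonal to k (`k_dot_trans` vanishes to rounding error) and the two parts add up to the original amplitude, mirroring (100) and (102) for one Fourier component.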

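The longitudinal waves are not physically important. They can
be eliminated from the equations by a certain mathematical
transformation, which forms a generalization of the method which led to
equation (61) for the case of no charges. The equations are thereby
simplified and brought into more direct connexion with experiment,
but they lose their relativistic form, as the separation of the field
into longitudinal and transverse waves is not Lorentz invariant.

By making a Fourier resolution of the left-hand side of equation
(91) we get, with the help of (10), the equations

$$\begin{aligned}
\Big\{k^\mu A_{k\mu} - \cos(k\lambda)/4\pi^2 \cdot \sum_i e_i\, e^{-i(kz_i)}\Big\}\,|z\rangle &= 0, \\
\Big\{k^\mu \bar{A}_{k\mu} - \cos(k\lambda)/4\pi^2 \cdot \sum_i e_i\, e^{i(kz_i)}\Big\}\,|z\rangle &= 0,
\end{aligned} \tag{103}$$

forming the generalization of (48). If for the moment we take discrete
k-values, the commutation relations (97) become, from (26),

$$\bar{A}_{k\mu} A_{k'\nu} - A_{k'\nu}\bar{A}_{k\mu} = -g_{\mu\nu}/4\pi^2 \cdot \hbar k_0 \cos(k\lambda)\, s_k\, \delta_{kk'}, \tag{104}$$

and show us that, with the notation (99),

$$A_{k0}\,\psi\rangle_F = \frac{\hbar k_0 \cos(k\lambda)\, s_k}{4\pi^2}\,\frac{\partial\psi}{\partial \bar{A}_{k0}}\rangle_F, \qquad
\bar{A}_{kr}\,\psi\rangle_F = \frac{\hbar k_0 \cos(k\lambda)\, s_k}{4\pi^2}\,\frac{\partial\psi}{\partial A_{kr}}\rangle_F. \tag{105}$$

Thus equations (103) become, on multiplication by
$4\pi^2/\cos(k\lambda)$,

$$\begin{aligned}
\Big\{\hbar k_0^2 s_k \frac{\partial}{\partial \bar{A}_{k0}} - \frac{4\pi^2 (\mathbf{k}\mathbf{A}_k)}{\cos(k\lambda)} - \sum_i e_i\, e^{-i(kz_i)}\Big\}\,\psi &= 0, \\
\Big\{\frac{4\pi^2 k_0 \bar{A}_{k0}}{\cos(k\lambda)} - \hbar k_0 s_k\, k_r \frac{\partial}{\partial A_{kr}} - \sum_i e_i\, e^{i(kz_i)}\Big\}\,\psi &= 0.
\end{aligned}$$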

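These equations holding for all k show that $\psi$ is of the form

$$\psi = e^{S/\hbar}\chi_1, \tag{106}$$

where

$$S = \sum_k k_0^{-2} s_k^{-1}\Big\{4\pi^2 (\mathbf{k}\mathbf{A}_k)\bar{A}_{k0}/\cos(k\lambda) + \sum_i e_i\big[\bar{A}_{k0}\, e^{-i(kz_i)} - k_0^{-1}(\mathbf{k}\mathbf{A}_k)\, e^{i(kz_i)}\big]\Big\}$$

and $\chi_1$ is independent of $\bar{A}_{k0}$ and of the longitudinal
component of $A_{kr}$. Passing back to continuous k-values, we find that
$\psi$ is still of the form (106) with S given by

$$S = \int \Big\{4\pi^2 (\mathbf{k}\mathbf{A}_k)\bar{A}_{k0}/\cos(k\lambda) + \sum_i e_i\big[\bar{A}_{k0}\, e^{-i(kz_i)} - k_0^{-1}(\mathbf{k}\mathbf{A}_k)\, e^{i(kz_i)}\big]\Big\}\, k_0^{-2}\, d^3k. \tag{107}$$

Thus, as in the case of no charges, we find that the form of the wave
function $\psi$ is fixed so far as concerns the longitudinal degrees of
freedom. The important part of $\psi$ is the factor $\chi_1$, which
involves only the transverse components of $A_{kr}$, together with the
z’s and spin variables.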

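We may look upon $\chi_1$ as a wave function from which the
longitudinal waves have been eliminated. We can obtain wave equations
for $\chi_1$ in the following way. We have

$$\begin{aligned}
p_{\mu j}\, e^{S/\hbar} &= e^{S/\hbar}\{p_{\mu j} + i\,\partial S/\partial z_j^\mu\} \\
&= e^{S/\hbar}\Big\{p_{\mu j} + e_j \int \big[\bar{A}_{k0}\, e^{-i(kz_j)} + k_0^{-1}(\mathbf{k}\mathbf{A}_k)\, e^{i(kz_j)}\big]\, k_\mu k_0^{-2}\, d^3k\Big\}.
\end{aligned} \tag{108}$$

Using this result for $\mu = 0$, we get

$$\begin{aligned}
\{p_{0j} - e_j L_0(z_j)\}\,\psi\rangle_F
&= \Big\{p_{0j} - e_j \int \big[A_{k0}\, e^{i(kz_j)} + \bar{A}_{k0}\, e^{-i(kz_j)}\big]\, k_0^{-1}\, d^3k\Big\}\, e^{S/\hbar}\chi_1\rangle_F \\
&= e^{S/\hbar} p_{0j}\,\chi_1\rangle_F + e_j \int \big[(\mathbf{k}\mathbf{A}_k) - k_0 A_{k0}\big]\, e^{i(kz_j)}\, k_0^{-2}\, d^3k\ e^{S/\hbar}\chi_1\rangle_F \\
&= e^{S/\hbar} p_{0j}\,\chi_1\rangle_F - e_j/4\pi^2 \cdot \sum_i e_i \int \cos(k\lambda)\, e^{i(k,\, z_j - z_i)}\, k_0^{-2}\, d^3k\ e^{S/\hbar}\chi_1\rangle_F
\end{aligned}$$

with the help of the first of equations (103). Again, using (101) and
(108) with $\mu = r$, we get

$$\begin{aligned}
\{p_{rj} - e_j L_r(z_j)\}\,\psi\rangle_F
&= \Big\{p_{rj} - e_j \int \big[(\mathbf{k}\mathbf{A}_k)\, e^{i(kz_j)} + (\mathbf{k}\bar{\mathbf{A}}_k)\, e^{-i(kz_j)}\big]\, k_r k_0^{-3}\, d^3k\Big\}\, e^{S/\hbar}\chi_1\rangle_F \\
&= e^{S/\hbar} p_{rj}\,\chi_1\rangle_F + e_j \int \big[k_0 \bar{A}_{k0} - (\mathbf{k}\bar{\mathbf{A}}_k)\big]\, e^{-i(kz_j)}\, k_r k_0^{-3}\, d^3k\ e^{S/\hbar}\chi_1\rangle_F \\
&= e^{S/\hbar} p_{rj}\,\chi_1\rangle_F + e_j/4\pi^2 \cdot \sum_i e_i \int \cos(k\lambda)\, e^{i(k,\, z_i - z_j)}\, k_r k_0^{-3}\, d^3k\ e^{S/\hbar}\chi_1\rangle_F
\end{aligned}$$

with the help of the second of equations (103). These equations may
be combined as

$$\{p_{\mu j} - e_j L_\mu(z_j)\}\,\psi\rangle_F = e^{S/\hbar}\{p_{\mu j} - e_j B_\mu(z_j)\}\,\chi_1\rangle_F, \tag{109}$$

where

$$\begin{aligned}
B_0(x) &= 1/4\pi^2 \cdot \sum_i e_i \int \cos(k\lambda)\, e^{i(k,\, x - z_i)}\, k_0^{-2}\, d^3k, \\
B_r(x) &= -1/4\pi^2 \cdot \sum_i e_i \int \cos(k\lambda)\, e^{-i(k,\, x - z_i)}\, k_r k_0^{-3}\, d^3k.
\end{aligned} \tag{110}$$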

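The equations may be simplified by a further transformation, namely

$$\chi_1 = e^{T/\hbar}\chi, \tag{111}$$

where

$$T = -1/8\pi^2 \cdot \sum_{i,j} e_i e_j \int \cos(k\lambda)\cos(k,\, z_i - z_j)\, k_0^{-3}\, d^3k.$$

Equations (109) go over into

$$\begin{aligned}
\{p_{\mu j} - e_j L_\mu(z_j)\}\,\psi\rangle_F
&= e^{(S+T)/\hbar}\{p_{\mu j} - e_j B_\mu(z_j) + i\,\partial T/\partial z_j^\mu\}\,\chi\rangle_F \\
&= e^{(S+T)/\hbar}\{p_{\mu j} - e_j b_\mu(z_j)\}\,\chi\rangle_F,
\end{aligned} \tag{112}$$

where

$$\begin{aligned}
b_\mu(x) &= B_\mu(x) - i/4\pi^2 \cdot \sum_i e_i \int \cos(k\lambda)\sin(k,\, x - z_i)\, k_\mu k_0^{-3}\, d^3k \\
&= \pm 1/4\pi^2 \cdot \sum_i e_i \int \cos(k\lambda)\cos(k,\, x - z_i)\, k_\mu k_0^{-3}\, d^3k,
\end{aligned} \tag{113}$$

the + or − sign being taken according to whether $\mu$ is zero or not.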

With the help of (100) and (112), the wave equations (87), (88) go
over into

$$\{p_{0j} - e_j b_0(z_j) + (\boldsymbol{\alpha}_j,\ \mathbf{p}_j - e_j \mathbf{b}(z_j) - e_j \mathbf{M}(z_j)) + \alpha_{mj} m_j\}\,\chi\rangle_F = 0. \tag{114}$$

The variables describing the longitudinal waves have all disappeared
from these equations. We may take $\chi$ as the wave function for the
theory in which the longitudinal waves have been eliminated (it is
rather more convenient for this purpose than $\chi_1$), and equations
(114) are the wave equations which it has to satisfy. The influence of
the longitudinal waves now shows itself up through the functions
$b_\mu(z_j)$ of the particle variables appearing in the Hamiltonians.
The supplementary conditions (91) have been satisfied through our using
(106), and drop out of the present formulation of the theory.

To work out the function $b_\mu(x)$ we must evaluate integrals of the
form

$$J_\mu(x) = \int \cos(kx)\, k_\mu k_0^{-3}\, d^3k \tag{115}$$

for a general 4-vector x, with $k_0$ given by (98). Since the integrand
in (115) is unchanged when −k is put for k, the integral is equal to

$$J_\mu(x) = \tfrac{1}{2}\sum \int \cos(kx)\, k_\mu k_0^{-3}\, d^3k,$$

where $\sum$ means summing over both values $\pm|\mathbf{k}|$ for
$k_0$. Thus $J_\mu(x)$ equals

$$J_\mu(x) = \tfrac{1}{2}\int \Delta(k)\cos(kx)\, k_\mu k_0^{-2}\, d^4k.$$

This integral may be evaluated most conveniently from formula (10),
which gives us, on taking the real part of both sides,

$$\tfrac{1}{2}\int \Delta(k)\sin(kx)\, d^4k = 2\pi^2 \Delta(x) = 2\pi^2 |\mathbf{x}|^{-1}\{\delta(x_0 - |\mathbf{x}|) - \delta(x_0 + |\mathbf{x}|)\}.$$

Integrating both sides here with respect to $x_0$, we find

$$J_0(x) = \tfrac{1}{2}\int \Delta(k)\cos(kx)\, k_0^{-1}\, d^4k =
\begin{cases} 0 & \text{for } (xx) > 0, \\ 2\pi^2 |\mathbf{x}|^{-1} & \text{for } (xx) < 0, \end{cases} \tag{116}$$

the constant of integration being fixed by the condition that $J_0(x)$
vanishes for $x_0 \to \pm\infty$ with $x_1, x_2, x_3$ fixed. Integrating
(116) with respect to $x_0$, we find

$$\tfrac{1}{2}\int \Delta(k)\sin(kx)\, k_0^{-2}\, d^4k =
\begin{cases} -2\pi^2 & \text{for } (xx) > 0,\ x_0 < 0, \\ 2\pi^2 x_0 |\mathbf{x}|^{-1} & \text{for } (xx) < 0, \\ 2\pi^2 & \text{for } (xx) > 0,\ x_0 > 0, \end{cases}$$

the constant of integration being fixed by the condition that the
integral vanishes for $x_0 = 0$. Differentiating with respect to $x_r$,
we get

$$J_r(x) = \tfrac{1}{2}\int \Delta(k)\cos(kx)\, k_r k_0^{-2}\, d^4k =
\begin{cases} 0 & \text{for } (xx) > 0, \\ 2\pi^2 x_0 x_r |\mathbf{x}|^{-3} & \text{for } (xx) < 0. \end{cases} \tag{117}$$
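Using the results (116), (117) in (113), we get, with reference to (70),

$$\begin{aligned}
b_0(z_j) &= \tfrac{1}{4}\sum_i e_i\{|\mathbf{z}_j - \mathbf{z}_i + \boldsymbol{\lambda}|^{-1} + |\mathbf{z}_j - \mathbf{z}_i - \boldsymbol{\lambda}|^{-1}\}, \\
b_r(z_j) &= -\tfrac{1}{4}\sum_i e_i\{(z_{0j} - z_{0i} + \lambda_0)(z_{rj} - z_{ri} + \lambda_r)\,|\mathbf{z}_j - \mathbf{z}_i + \boldsymbol{\lambda}|^{-3} \\
&\qquad\qquad + (z_{0j} - z_{0i} - \lambda_0)(z_{rj} - z_{ri} - \lambda_r)\,|\mathbf{z}_j - \mathbf{z}_i - \boldsymbol{\lambda}|^{-3}\}.
\end{aligned} \tag{118}$$

The terms i = j in the sums are zero on account of
$(\lambda\lambda) > 0$. These terms would have been infinitely great if
we had put $\lambda = 0$ in (113), so we see here the need for not
passing to the limit $\lambda \to 0$ too early in the theory. However,
it is permissible to put $\lambda = 0$ in (118), so we may take

$$\begin{aligned}
b_0(z_j) &= \tfrac{1}{2}\sum_{i \neq j} e_i\,|\mathbf{z}_j - \mathbf{z}_i|^{-1}, \\
b_r(z_j) &= -\tfrac{1}{2}\sum_{i \neq j} e_i\,(z_{0j} - z_{0i})(z_{rj} - z_{ri})\,|\mathbf{z}_j - \mathbf{z}_i|^{-3}.
\end{aligned} \tag{119}$$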
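
The space-like value $2\pi^2|\mathbf{x}|^{-1}$ in (116) can be checked in the special case $x_0 = 0$, where the integral (115) with $\mu = 0$ reduces, after the angular integration, to $4\pi r^{-1}\int_0^\infty \sin u/u\; du$ with $r = |\mathbf{x}|$. The sketch below is only a rough numerical check: the convergence factor `eps`, the step `du` and the upper limit `u_max` are choices made here, not part of the text, and the result approaches the exact value only as `eps` tends to zero.

```python
import math

def j0_equal_time(r, eps=0.01, du=0.01, u_max=2000.0):
    """Approximate J_0 at x0 = 0 and distance r by a damped midpoint sum.

    The factor exp(-eps*u) is a convergence device introduced for this check.
    """
    steps = int(u_max / du)
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du                     # midpoint rule avoids u = 0
        total += math.exp(-eps * u) * math.sin(u) / u
    return 4.0 * math.pi / r * total * du

r = 2.0
numerical = j0_equal_time(r)
exact = 2.0 * math.pi ** 2 / r                 # value quoted in (116)
```

With the default parameters the numerical value lies within about one per cent of $2\pi^2/r$, the residual difference coming from the damping factor.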


The relativistic form of the theory has been spoilt by the elimination
of the longitudinal waves. There is now not much point in
retaining different time variables $z_{0i}$ for the different particles.
By putting all the $z_0$’s equal to t we can get a further
simplification of the equations. We have in the first place
$b_r(z_j) = 0$. We can write the wave equations (114) as

$$i\hbar\,\frac{\partial \chi}{\partial z_{0j}}\rangle_F = H_j\,\chi\rangle_F,$$

where

$$H_j = e_j b_0(z_j) - (\boldsymbol{\alpha}_j,\ \mathbf{p}_j - e_j \mathbf{M}(z_j)) - \alpha_{mj} m_j.$$

We then have

$$i\hbar\,\frac{d}{dt}\,\chi_{z_0 = t}\rangle_F = \sum_j H_j\,\chi_{z_0 = t}\rangle_F. \tag{120}$$

Thus the wave function $\chi_{z_0 = t}$ satisfies one wave equation, in
which the Hamiltonian is the sum of the Hamiltonians in the many-time
formulation.

The total contribution of the $b_0(z_j)$ terms to the Hamiltonian
$\sum_j H_j$ is

$$\sum_j e_j b_0(z_j) = \tfrac{1}{2}\sum_{i \neq j} e_i e_j\,|\mathbf{z}_j - \mathbf{z}_i|^{-1}. \tag{121}$$

This is precisely the Coulomb interaction energy. Thus the
longitudinal waves get replaced by the Coulomb interaction energy in the
single-time formulation of the theory. We can now see the real
significance of the longitudinal waves of the Wentzel field. They are to
enable one to bring the Coulomb forces into electrodynamics in a
relativistic manner.
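
The step from the equal-time form of $b_0$ in (119) to the Coulomb energy (121) can be illustrated with a small numerical sketch. The function names, charges and positions below are made up for the illustration; the check is that summing $e_j b_0(z_j)$ over the particles reproduces the sum of $e_i e_j/r_{ij}$ over distinct pairs.

```python
import math
from itertools import combinations

def b0(j, charges, positions):
    """Equal-time b_0 at particle j: half the potential due to the other charges."""
    return 0.5 * sum(e_i / math.dist(positions[j], positions[i])
                     for i, e_i in enumerate(charges) if i != j)

def coulomb_from_b0(charges, positions):
    """Left-hand side of (121): sum over j of e_j * b_0(z_j)."""
    return sum(e_j * b0(j, charges, positions) for j, e_j in enumerate(charges))

def coulomb_pairwise(charges, positions):
    """Coulomb energy counted once per pair of particles."""
    return sum(charges[i] * charges[j] / math.dist(positions[i], positions[j])
               for i, j in combinations(range(len(charges)), 2))

charges = [1.0, -1.0, 2.0]                                   # arbitrary test charges
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
from_b0 = coulomb_from_b0(charges, positions)
pairwise = coulomb_pairwise(charges, positions)
```

The factor ½ in $b_0$ compensates for each pair being counted twice in the double sum, so `from_b0` and `pairwise` agree.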

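A further transformation of the wave equation is of interest. Let

$$\chi_{z_0 = t}\rangle_F = e^{iH_R t/\hbar}\,\Psi, \tag{122}$$

where $H_R$ is the Hamiltonian of the field in the absence of charges,
given by (41), and let us consider $\Psi$ as a new wave function. It
satisfies the wave equation

$$i\hbar\, d\Psi/dt = \Big(H_R + \sum_j H_j^*\Big)\Psi, \tag{123}$$

where

$$H_j^* = e^{-iH_R t/\hbar} H_j\, e^{iH_R t/\hbar}
= e_j b_0(z_j) - (\boldsymbol{\alpha}_j,\ \mathbf{p}_j - e_j \mathbf{M}^*(z_j)) - \alpha_{mj} m_j,$$

with

$$M_r^*(x) = e^{-iH_R t/\hbar} M_r(x)\, e^{iH_R t/\hbar}.$$

If we express $M_r(x)$ in terms of its Fourier components

$$M_r(x) = \int \{M_{kr}\, e^{i(kx)} + \bar{M}_{kr}\, e^{-i(kx)}\}\, k_0^{-1}\, d^3k, \tag{124}$$

$M_{kr}$ being the part of the three-dimensional vector $A_{kr}$
perpendicular to $k_r$, then we have, with the help of (42) and (1),

$$M_r^*(t, x_1, x_2, x_3) = \int \{M_{kr}\, e^{-i(\mathbf{k}\mathbf{x})} + \bar{M}_{kr}\, e^{i(\mathbf{k}\mathbf{x})}\}\, k_0^{-1}\, d^3k. \tag{125}$$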

Thus $M_r^*(t, x_1, x_2, x_3)$ is a function of $x_1, x_2, x_3$ not
involving t, and is a constant linear operator. The Hamiltonian in the
wave equation (123) is now constant, and the wave equation itself is of
the usual form for an isolated system in non-relativistic theory.
Further, the Hamiltonian in (123) is just what one would get with the
non-relativistic theory of § 62 if one takes for $H_P$ in equation (53)
of § 62 the proper-energy of a set of particles each with spin
$\tfrac{1}{2}\hbar$, together with their Coulomb interaction energy.
This rather surprising result means that the theory of § 62 applied to
particles with spin $\tfrac{1}{2}\hbar$ and with Coulomb interaction
energy is essentially a relativistic theory, leading to physical
consequences which are invariant under Lorentz transformations, in spite
of the form of the theory departing so much from the usual relativistic
requirements.

81. Discussion of the transverse waves 

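Let us apply the theory of the preceding section to the case of a
single particle. There is then just one wave equation (114) and the
terms involving b drop out, so the wave equation becomes

$$\{p_0 + (\boldsymbol{\alpha}, \mathbf{p}) + \alpha_m m\}\,\chi\rangle_F = e\,(\boldsymbol{\alpha}, \mathbf{M}(z))\,\chi\rangle_F. \tag{126}$$

This is the wave equation for a single particle interacting with the
electromagnetic field. Let us try to get a solution of it on the
assumption that the interaction term in the Hamiltonian, namely
$e(\boldsymbol{\alpha}, \mathbf{M}(z))$, is small. Such a solution would
be of the form of a power series in the charge e,

$$\chi = \chi_0 + e\chi_1 + e^2\chi_2 + \ldots, \tag{127}$$

where $\chi_0, \chi_1, \chi_2, \ldots$ are independent of e.
Substituting (127) in (126) and picking out terms of different degree in
e, we get the successive equations

$$\{p_0 + (\boldsymbol{\alpha}, \mathbf{p}) + \alpha_m m\}\,\chi_0\rangle_F = 0, \tag{128}$$

$$\{p_0 + (\boldsymbol{\alpha}, \mathbf{p}) + \alpha_m m\}\,\chi_1\rangle_F = (\boldsymbol{\alpha}, \mathbf{M}(z))\,\chi_0\rangle_F, \tag{129}$$

$$\{p_0 + (\boldsymbol{\alpha}, \mathbf{p}) + \alpha_m m\}\,\chi_2\rangle_F = (\boldsymbol{\alpha}, \mathbf{M}(z))\,\chi_1\rangle_F. \tag{130}$$

A solution of (128) corresponding to the particle having the energy
and momentum p', with $(p'p') = m^2$, and no photons present is

$$\chi_0 = e^{-i(p'z)/\hbar}\,|a\rangle, \tag{131}$$

where $|a\rangle$ is a ket in the spin degrees of freedom satisfying

$$\{p'_0 + (\boldsymbol{\alpha}, \mathbf{p}') + \alpha_m m\}\,|a\rangle = 0. \tag{132}$$

Substituting (131) in (129) and using (124) and

$$\bar{M}_{kr}\rangle_F = 0, \tag{133}$$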

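we get

$$\{p_0 + (\boldsymbol{\alpha}, \mathbf{p}) + \alpha_m m\}\,\chi_1\rangle_F = \int (\boldsymbol{\alpha}, \mathbf{M}_k)\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |a\rangle\rangle_F.$$

To solve this equation for $\chi_1$, we multiply both sides by the
operator $\{p_0 - (\boldsymbol{\alpha}, \mathbf{p}) - \alpha_m m\}$ on
the left, which gives

$$\begin{aligned}
\{(pp) - m^2\}\,\chi_1\rangle_F
&= \{p_0 - (\boldsymbol{\alpha}, \mathbf{p}) - \alpha_m m\} \int (\boldsymbol{\alpha}, \mathbf{M}_k)\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |a\rangle\rangle_F \\
&= \int \{p'_0 - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\}\,(\boldsymbol{\alpha}, \mathbf{M}_k)\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |a\rangle\rangle_F.
\end{aligned} \tag{134}$$

The operator $\{(pp) - m^2\}$ applied to the integrand here is
equivalent to the multiplying factor

$$(-\hbar k + p',\ -\hbar k + p') - m^2 = -2\hbar(kp'),$$

and hence a solution of (134) is

$$\chi_1 = -\tfrac{1}{2}\hbar^{-1} \int (kp')^{-1}\{p'_0 - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\}\,(\boldsymbol{\alpha}, \mathbf{M}_k)\, e^{i(k - p'/\hbar,\ z)}\, k_0^{-1}\, d^3k\ |a\rangle.$$

This $\chi_1$ is linear in the $M_{kr}$ variables and corresponds to one
photon being present. Substituting this $\chi_1$ into (130), we see that
$\chi_2$ is of the form

$$\chi_2 = \chi_2^{(2)} + \chi_2^{(0)},$$

where

$$\{p_0 + (\boldsymbol{\alpha}, \mathbf{p}) + \alpha_m m\}\,\chi_2^{(2)}\rangle_F = \int (\boldsymbol{\alpha}, \mathbf{M}_{k'})\, e^{i(k'z)}\, k_0'^{-1}\, d^3k'\ \chi_1\rangle_F, \tag{135}$$

$$\{p_0 + (\boldsymbol{\alpha}, \mathbf{p}) + \alpha_m m\}\,\chi_2^{(0)}\rangle_F = \int (\boldsymbol{\alpha}, \bar{\mathbf{M}}_{k'})\, e^{-i(k'z)}\, k_0'^{-1}\, d^3k'\ \chi_1\rangle_F. \tag{136}$$

The right-hand side of (135) is quadratic in the variables $M_{kr}$ and
leads to a quadratic $\chi_2^{(2)}$, corresponding to two photons being
present, while (136) leads, as we shall see, to a $\chi_2^{(0)}$
independent of the $M_{kr}$ variables, corresponding to no photons
present.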
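
The multiplying factor used above in solving (134) relies only on $(kk) = 0$ and $(p'p') = m^2$. A minimal numerical sketch of that identity follows; it assumes the signature (+, -, -, -), and the values chosen for $\hbar$, m and the vectors are arbitrary test numbers, not quantities from the text.

```python
import math

def mdot(a, b):
    """Four-dimensional scalar product (ab), signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

hbar = 1.0                                        # assumed units for the check
m = 1.0
p3 = (0.3, -0.4, 1.2)                             # arbitrary spatial momentum
p = (math.sqrt(m * m + sum(c * c for c in p3)),) + p3    # (p'p') = m^2
k3 = (0.8, 0.1, -0.5)                             # arbitrary photon wave vector
k = (math.sqrt(sum(c * c for c in k3)),) + k3     # (kk) = 0

q = tuple(pc - hbar * kc for pc, kc in zip(p, k)) # the 4-vector p' - hbar*k
lhs = mdot(q, q) - m * m
rhs = -2.0 * hbar * mdot(k, p)
```

Expanding the product gives $(p'p') - 2\hbar(kp') + \hbar^2(kk) - m^2$, in which the first and last terms cancel and the third vanishes, so `lhs` and `rhs` agree to rounding error.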

The right-hand side of (136) contains terms of the form
$\bar{M}_{k'r} M_{ks}\rangle_F$ so far as concerns the field variables.
Such a term becomes, with the help of (133) and of the commutation
relations (97),

$$\bar{M}_{k'r} M_{ks}\rangle_F = -g_{rs}/4\pi^2 \cdot \hbar k_0 \cos(k\lambda)\, \delta_3(\mathbf{k} - \mathbf{k}')\rangle_F$$

if r and s denote directions in three-dimensional space perpendicular
to $(k_1, k_2, k_3)$ and either equal or perpendicular to each other.
Using this result, the right-hand side of (136) becomes

$$-1/8\pi^2 \cdot \iint \sum_r \alpha_r\{p'_0 - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\}\,\alpha_r\,(kp')^{-1}\cos(k\lambda)\, e^{i(k - k' - p'/\hbar,\ z)}\, \delta_3(\mathbf{k} - \mathbf{k}')\, k_0'^{-1}\, d^3k\, d^3k'\ |a\rangle\rangle_F, \tag{137}$$

where the summation with respect to r refers to two perpendicular
directions for r which are both perpendicular to $(k_1, k_2, k_3)$. The
expression (137) reduces to

$$-1/8\pi^2 \cdot \int \sum_r \alpha_r\{p'_0 - \hbar k_0 - (\boldsymbol{\alpha},\ \mathbf{p}' - \hbar\mathbf{k}) - \alpha_m m\}\,\alpha_r\,(kp')^{-1}\cos(k\lambda)\, k_0^{-1}\, d^3k\ e^{-i(p'z)/\hbar}\,|a\rangle\rangle_F.$$

This is a divergent integral since it contains, amongst other terms,
one involving

$$\int (kp')^{-1}\cos(k\lambda)\, d^3k,$$

which diverges, with $k_0$ given by (98), even before passing to the
limit $\lambda \to 0$. We can conclude that the wave equation (126) has
no solution of the form of a power series in the charge e. This
conclusion must hold also for the wave equation for several particles:
the transverse electromagnetic waves always lead to divergent integrals
when one tries to get a solution of the form of a power series in the
charges on the particles.
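
How such a term grows with the upper limit of integration can be sketched under simplifying assumptions that are made here and are not in the text: the particle is taken at rest, so that $(kp') = k_0 m$, and $\lambda$ is taken along the time axis with component `lam0`. With $k_0$ chosen according to (98), the integrand $(kp')^{-1}\cos(k\lambda)$ is then never negative, and the integral over all directions up to a largest wave number K reduces to a one-dimensional radial integral.

```python
import math

def radial_integral(k_max, lam0, m=1.0, dk=0.01):
    """Integral of cos(k*lam)/(k p') over |k| < k_max, particle at rest (assumed).

    With k0 = |k| * sign(cos(|k| * lam0)), the integrand becomes
    |cos(|k| * lam0)| / (m * |k|), and d^3k = 4*pi*k^2 dk.
    """
    steps = int(k_max / dk)
    total = 0.0
    for i in range(steps):
        kk = (i + 0.5) * dk                        # midpoint rule
        total += kk * abs(math.cos(kk * lam0))
    return 4.0 * math.pi / m * total * dk

values = [radial_integral(k_max, lam0=0.1) for k_max in (200.0, 400.0, 800.0)]
ratios = [values[1] / values[0], values[2] / values[1]]
```

Each doubling of the upper limit multiplies the result by roughly four, so under these assumptions the integral grows like the square of the largest wave number admitted and has no finite limit, whatever the fixed value of `lam0`.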

We have here a fundamental difficulty in quantum electrodynamics,
a difficulty which has not yet been solved. It may be that the wave
equation (126) has solutions which are not of the form of a power
series in e. Such solutions have not yet been found. If they exist
they are presumably very complicated. Thus even if they exist the
theory would not be satisfactory, as we should require of a
satisfactory theory that its equations have a simple solution for any
simple physical problem, and the solution of (126) for the trivial
problem of the motion of a single charged particle in the absence of
any incident field of radiation has not yet been found.

Quantum electrodynamics has many satisfactory features in it,
closely analogous to various features in classical electrodynamics.
One can get from it finite and reasonable answers for problems
concerning the emission, absorption, and scattering of radiation whose
wavelength is not too short, by cutting off the divergent integrals at
a value for $|\mathbf{k}|$ of the order $2\pi m/e^2$, which cutting off
means physically that the contribution of transverse electromagnetic
waves of wavelength less than $e^2/m$ to the process under
investigation is neglected. The wavelength $e^2/m$ is chosen for the
cut-off because it is of the order of the classical radius of a particle
of charge e and mass m on Lorentz’s model of the electron. The cutting
off is not a relativistic procedure and can lead to well-defined results
only for problems in which the important wavelengths are considerably
greater than $e^2/m$.
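
For a sense of scale, the cut-off length can be evaluated for the electron. The sketch below restores the factor $c^2$, which the text sets equal to unity, and uses rounded Gaussian (CGS) values of the constants; these numbers are supplied here for illustration and are not quoted in the text.

```python
import math

E_CHARGE = 4.80320e-10      # electron charge in statcoulombs (rounded)
E_MASS = 9.10938e-28        # electron mass in grams (rounded)
LIGHT_SPEED = 2.99792e10    # speed of light in cm/s (rounded)

# The length e^2/m of the text becomes e^2/(m c^2) in ordinary units.
cutoff_length_cm = E_CHARGE ** 2 / (E_MASS * LIGHT_SPEED ** 2)

# Corresponding cut-off wave number 2*pi*m/e^2, in reciprocal centimetres.
cutoff_wave_number = 2.0 * math.pi / cutoff_length_cm
```

The length comes out close to 2.8 × 10⁻¹³ cm, so on this prescription only radiation of wavelength considerably greater than this is treated reliably.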

It is probable that some deep-lying changes will have to be made
in the present formalism before it will provide a reliable theory for
radiative processes involving short wavelengths. These changes may
correspond to a departure from the point-charge model of elementary
particles which provides the basis of the present theory. Already in
the classical theory the point-charge model involves some difficulties
in interpretation and application,† even though it leads to well-defined
equations of motion, as given in § 78, so it is not surprising that the
passage to the quantum theory brings in further difficulties.

† See Dirac, Proc. Roy. Soc. A, 167 (1938), 148 and Eliezer, Proc. Camb.
Phil. Soc. 39 (1943), 173.